


A new tool being previewed in the Visual Studio Code Insiders channel can generate code to ease the tedious data preparation process that data scientists go through to get good data for successful analysis projects.

The Data Wrangler extension works with Python, the favorite programming language of data scientists, and the associated open source Pandas library to enhance the data preparation process: exploring, manipulating/cleansing and visualizing data. Microsoft describes the VS Code Insiders preview as "the first step towards our vision of simplifying and expediting the data preparation process on Microsoft platforms."

The idea is to get the time-consuming, tedious stuff out of the way so data scientists can more quickly get about their business, like gleaning actionable business insights from corporate data. How time consuming? Microsoft pointed to the Anaconda State of Data Science Report 2022, in which survey respondents (Python data scientists using the Pandas dataframe library) indicated they spend about 37.75 percent of their time on data preparation and cleansing, with data visualization, which is critical to interpreting results, also taking up a big chunk of time.

Data Wrangler in Action (source: Microsoft).

Data Wrangler uses code-generating techniques that are becoming popularized with the advent of advanced AI coding assistants. In this case, it's PROSE, an AI-powered program synthesis technology. To delete a column, for example, a right-click on a column heading will generate the necessary Python code to do that. Also, directly from the UI, users can remove rows containing missing values or substitute them with a computed default value.
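For readers unfamiliar with Pandas, the cleaning operations mentioned in this article (deleting a column, removing rows with missing values, substituting a computed default) correspond roughly to the calls sketched below. This is an illustrative example only, not code produced by Data Wrangler: the sample data and column names are made up, and using the column mean as the "computed default" is an assumption.

```python
import pandas as pd

# Hypothetical sample data; the column names are invented for illustration.
df = pd.DataFrame(
    {
        "name": ["Ann", "Bob", "Cy", "Dee"],
        "age": [34.0, None, 29.0, 41.0],
        "notes": ["a", "b", None, "d"],
    }
)

# Delete a column (the operation behind a right-click on a column heading).
without_notes = df.drop(columns=["notes"])

# Remove rows that contain any missing value.
complete_rows = without_notes.dropna()

# Alternatively, substitute missing values with a computed default.
# Assumption: the default here is the mean of the non-missing ages.
mean_age = without_notes["age"].mean()
filled = without_notes.fillna({"age": mean_age})

print(complete_rows)
print(filled)
```

Each step returns a new dataframe rather than modifying the original, which makes it easy to compare the result of dropping rows against the result of filling them.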
