Data scientists and data wranglers often have existing code that they would like to run at scale over large data sets. In this presentation, we show how to meet your customers where they are, allowing them to take their existing Python, R, and Java code and libraries, along with existing formats such as Parquet, and apply them at scale to schematize unstructured data and process large amounts of data in Azure Data Lake with U-SQL. We will also show how large customers use U-SQL partitioned output to meet the challenge of processing multiple cubes over data subsets, securing data for specific audiences and making it easy to dynamically partition data for processing from Azure Data Lake.
To become proficient in authoring, debugging, and optimizing U-SQL code in Azure Data Lake Analytics, a developer must master the key concepts that underlie query execution. Through demos, this session dives deep into the underlying layers of the Data Lake Analytics service to explore fundamental query execution and performance concepts. Presented by: Saveen Reddy