{"id":611607,"date":"2019-08-14T00:00:08","date_gmt":"2019-08-14T07:00:08","guid":{"rendered":"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/?post_type=msr-research-item&#038;p=611607"},"modified":"2019-09-30T17:36:47","modified_gmt":"2019-10-01T00:36:47","slug":"diff-a-relational-interface-for-large-scale-data-explanation","status":"publish","type":"msr-video","link":"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/video\/diff-a-relational-interface-for-large-scale-data-explanation\/","title":{"rendered":"DIFF: A Relational Interface for Large-Scale Data Explanation"},"content":{"rendered":"<p>A range of explanation engines assist data analysts by performing feature selection over increasingly high-volume and high-dimensional data, grouping and highlighting commonalities among data points. While useful in diverse tasks such as user behavior analytics, operational event processing, and root cause analysis, today\u2019s explanation engines are designed as standalone data processing tools that do not interoperate with traditional, SQL-based analytics workflows; this limits the applicability and extensibility of these engines. In response, we propose the DIFF operator, a relational aggregation operator that unifies the core functionality of these engines with declarative relational query processing. We implement both single-node and distributed versions of the DIFF operator in MB SQL, an extension of MacroBase, and demonstrate how DIFF can provide the same semantics as existing explanation engines while capturing a broad set of production use cases in industry, including at Microsoft and Facebook. Additionally, we illustrate how this declarative approach to data explanation enables new logical and physical query optimizations. 
We evaluate these optimizations on several real-world production applications, and find that DIFF in MB SQL can outperform state-of-the-art engines by up to an order of magnitude.<\/p>\n<p>This is joint work with Peter Kraft, Sahaana Suri, Edward Gan, Eric Xu, Atul Shenoy\u2020, Asvin Ananthanarayan\u2020, John Sheu\u2020, Erik Meijer\u2021, Xi Wu\u00a7, Jeff Naughton\u00a7, Peter Bailis, Matei Zaharia at Stanford, Facebook (\u2021), Google (\u00a7), Microsoft (\u2020).<\/p>\n<p><a href=\"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-content\/uploads\/2019\/09\/DIFF-A-Relational-Interface-for-Large-Scale-Data-Explanation-SLIDES.pdf\">[SLIDES]<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>A range of explanation engines assist data analysts by performing feature selection over increasingly high-volume and high-dimensional data, grouping and highlighting commonalities among data points. While useful in diverse tasks such as user behavior analytics, operational event processing, and root cause analysis, today\u2019s explanation engines are designed as standalone data processing tools that do not 
[&hellip;]<\/p>\n","protected":false},"featured_media":611796,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr_hide_image_in_river":0,"footnotes":""},"research-area":[13563],"msr-video-type":[],"msr-locale":[268875],"msr-post-option":[],"msr-session-type":[],"msr-impact-theme":[],"msr-pillar":[],"msr-episode":[],"msr-research-theme":[],"class_list":["post-611607","msr-video","type-msr-video","status-publish","has-post-thumbnail","hentry","msr-research-area-data-platform-analytics","msr-locale-en_us"],"msr_download_urls":"","msr_external_url":"https:\/\/youtu.be\/dWEvtuxqbfk","msr_secondary_video_url":"","msr_video_file":"","_links":{"self":[{"href":"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-video\/611607","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-video"}],"about":[{"href":"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-video"}],"version-history":[{"count":4,"href":"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-video\/611607\/revisions"}],"predecessor-version":[{"id":611799,"href":"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-video\/611607\/revisions\/611799"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/media\/611796"}],"wp:attachment":[{"href":"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/media?parent=611607"}],"wp:term":[{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=611607"},{"taxonomy":"msr-video-type","embeddable":true,"href":"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr
-video-type?post=611607"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=611607"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=611607"},{"taxonomy":"msr-session-type","embeddable":true,"href":"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-session-type?post=611607"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=611607"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=611607"},{"taxonomy":"msr-episode","embeddable":true,"href":"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-episode?post=611607"},{"taxonomy":"msr-research-theme","embeddable":true,"href":"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-research-theme?post=611607"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}