ConfSeer: leveraging customer support knowledge bases for automated misconfiguration detection

  • Rahul Potharaju ,
  • Joseph Chan ,
  • Luhui Hu ,
  • Cristina Nita-Rotaru ,
  • Mingshi Wang ,
  • Liyuan Zhang ,
  • Navendu Jain

VLDB Endowment | , Vol 8(12): pp. 1828-1839

DOI

We introduce ConfSeer, an automated system that detects potential configuration issues or deviations from identified best practices by leveraging a knowledge base (KB) of technical solutions. The intuition is that these KB articles describe the configuration problems and their fixes so if the system can accurately understand them, it can automatically pinpoint both the errors and their resolution. Unfortunately, finding an accurate match is difficult because (a) the KB articles are written in natural language text, and (b) configuration files typically contain a large number of parameters with a high value range. Thus, expert-driven manual troubleshooting is not scalable.

While there are several state-of-the-art techniques proposed for individual tasks such as keyword matching, concept determination and entity resolution, none offer a practical end-to-end solution to detect problems in machine configurations. In this paper, we describe our experiences building ConfSeer using a novel combinations of ideas from natural language processing, information retrieval and interactive learning. ConfSeer powers the recommendation engine behind Microsoft Operations Management Suite that proposes fixes for software configuration errors. The system has been running in production for about a year to proactively find misconfigurations on tens of thousands of servers. Our evaluation of ConfSeer against an expert-defined rule-based commercial system, an expert survey and web search engines shows that it achieves 80%-97.5% accuracy and incurs low runtime overheads.