Applying Large-Scale Text Analytics with Graph Databases to Visualize Entity and Relationship Inferences
Trung Diep
DOCOMO Innovations, Inc.
Ronald Sujithan
Software Architect
DOCOMO Innovations, Inc.
Alan (Zhe) Wu


Tuesday, January 31, 2017
03:45 PM - 04:30 PM

Level:  Case Study

A vast amount of electronic text exists across the public Web and in private communications. These texts take many forms, ranging from formal documents to informal messages. Understanding their semantic content at this scale is an overwhelming task for humans. Combining NLP technologies with graph analytics is critical to converting free-form text into structured data that can be used to make entity and relationship inferences. The Data Ninja services, a set of cloud-based APIs, can extract entities and their relationships from document text and produce RDF triples that can be loaded into an Oracle Spatial and Graph database. In this talk, we will demonstrate an example that uses recent news articles about the Zika virus to produce actionable insights by combining Oracle, an enterprise-grade graph database, with the semantic content produced by the Data Ninja services.
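To make the text-to-triples step concrete, here is a minimal Python sketch of how extracted entities and relationships might be serialized as RDF triples in N-Triples syntax. It is not the Data Ninja API or its output format: the input dictionary layout, the `example.org` namespace, and the helper names (`iri`, `escape_literal`, `to_ntriples`) are all assumptions made for illustration, and the sample record is hand-written rather than produced by any service.

```python
from urllib.parse import quote

# Hypothetical namespace, used only for this illustration.
BASE = "http://example.org/"
RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"
RDFS_LABEL = "<http://www.w3.org/2000/01/rdf-schema#label>"


def iri(kind, label):
    """Build an IRI term such as <http://example.org/entity/Zika_virus>."""
    return "<" + BASE + kind + "/" + quote(label.replace(" ", "_")) + ">"


def escape_literal(text):
    """Escape backslashes, double quotes, and newlines for an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def to_ntriples(extraction):
    """Turn an (assumed) extraction result into a list of N-Triples lines."""
    lines = []
    for entity in extraction["entities"]:
        subject = iri("entity", entity["name"])
        # One triple for the entity's type, one for its human-readable label.
        lines.append(f"{subject} {RDF_TYPE} {iri('type', entity['type'])} .")
        label = escape_literal(entity["name"])
        lines.append(f'{subject} {RDFS_LABEL} "{label}" .')
    for relation in extraction["relations"]:
        # Each relationship becomes a single subject-predicate-object triple.
        subject = iri("entity", relation["subject"])
        predicate = iri("relation", relation["predicate"])
        obj = iri("entity", relation["object"])
        lines.append(f"{subject} {predicate} {obj} .")
    return lines


# Hand-written sample input; the field names are assumptions, not a real schema.
sample_extraction = {
    "entities": [
        {"name": "Zika virus", "type": "Disease"},
        {"name": "Aedes aegypti", "type": "Organism"},
    ],
    "relations": [
        {"subject": "Aedes aegypti", "predicate": "transmits", "object": "Zika virus"},
    ],
}

for line in to_ntriples(sample_extraction):
    print(line)
```

Lines in this form are plain N-Triples, so in principle they could be written to a file and loaded into any RDF store that accepts that format; the exact loading procedure for a given database is outside the scope of this sketch.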

Trung Diep currently directs the engineering team responsible for delivering the quality and performance of the Data Ninja services at DOCOMO Innovations. Prior to joining DOCOMO Innovations, Trung worked at Intel, Mercury Interactive, Rambus, and Broadcom, where he was an Associate Technical Director. He received his B.S. and B.A. degrees in Electrical Engineering and Computer Science, respectively, from Rice University and his M.S. and Ph.D. degrees in Computer Engineering from Carnegie Mellon University. His research interests range from hardware, such as processor performance modeling and simulation, to software, such as cloud computing technologies, with a particular focus on the meta layer in which hardware interacts with software. Trung has been granted more than 10 patents covering the areas of branch prediction, multicore arbitration and scheduling, user-level threading, cache memory, memory wear-leveling, and memory encryption.

Ronald Sujithan is a Software Architect with extensive experience in developing machine learning systems, building innovative Big Data solutions with a particular focus on Text Analytics, and creating RESTful APIs from first principles. He was responsible for architecting the RESTful APIs for the Data Ninja services at DOCOMO Innovations using a number of cutting-edge technologies. As an independent consultant, he has worked with several startups as well as established companies such as VISA, comScore, Dell EMC, and VMware. He received a Ph.D. in Computer Science from Oxford University. His research interests are in Parallel Computation, Database Systems, Data Mining, and Natural Language Processing.

Zhe Wu is an architect working on semantic and graph technologies at Oracle USA. He leads the design, architecture, and development of the inference engine for W3C RDFS/SKOS/OWL in the database, Java APIs for RDF Semantic Graph, RDF triple-level security, SQL-based graph analytics, RDF Graph for Oracle NoSQL Database, Property Graph for RDBMS and Hadoop, and more. As an Oracle representative, he has participated in the W3C OWL 2 and RDF working groups. He has served as a member of the program committee for ESWC 2011, OrdRing 2011-2014, ISWC 2010-2014, RR 2010, and OWLED 2008-2014. Zhe served as the co-chair for JIST 2011 and on the editorial board of SWJ 2010 (special issue on real-world applications of OWL). He also served on the UDDI standard committee from 2003 to 2005. Zhe received his PhD in computer science from the University of Illinois at Urbana-Champaign in 2001. He received his BE from the Special Class for Gifted Young, USTC, in 1996.
