links

Introducing the Knowledge Graph: things, not strings blog.google

Google's Knowledge Graph is a structured entity database that maps real-world objects (people, places, organisations, and concepts) to typed attributes and inter-entity relationships, replacing string-matched keyword lookup with disambiguated, meaning-based retrieval. The system resolves lexical ambiguity (e.g., "Taj Mahal" as monument vs. musician vs. restaurant) by anchoring queries to canonical entities with unique identifiers, drawing on public sources including Freebase, Wikipedia, and the CIA World Factbook to populate typed properties and relational edges. This shifts retrieval from document-to-keyword co-occurrence toward entity-to-entity graph traversal, enabling query expansion, direct answer surfacing, and contextual result clustering without requiring exact-match signals in crawled content.
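To make the "things, not strings" idea concrete, here is a minimal sketch of an entity store in which one surface string maps to several canonical entities, and a query is resolved by overlap with context terms. The identifiers (`e1` and so on), the `Entity` fields, and the overlap scoring are all hypothetical illustrations, not Google's schema or disambiguation method.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    entity_id: str      # hypothetical ID, not a real Knowledge Graph identifier
    name: str
    entity_type: str
    properties: dict = field(default_factory=dict)
    edges: dict = field(default_factory=dict)  # relation -> list of entity IDs


# Toy data: three entities share the surface string "Taj Mahal".
ENTITIES = {
    "e1": Entity("e1", "Taj Mahal", "Monument", {"location": "Agra"},
                 {"located_in": ["e4"]}),
    "e2": Entity("e2", "Taj Mahal", "Musician", {"genre": "blues"}),
    "e3": Entity("e3", "Taj Mahal", "Restaurant", {"cuisine": "Indian"}),
    "e4": Entity("e4", "India", "Country"),
}


def candidates(surface):
    """Return every entity whose name matches the query string."""
    return [e for e in ENTITIES.values() if e.name.lower() == surface.lower()]


def resolve(surface, context_terms):
    """Pick the candidate whose type and property values overlap most with
    the context terms (a deliberately naive stand-in for disambiguation)."""
    ctx = {t.lower() for t in context_terms}

    def score(entity):
        words = {entity.entity_type.lower()}
        words |= {str(v).lower() for v in entity.properties.values()}
        return len(words & ctx)

    cands = candidates(surface)
    return max(cands, key=score) if cands else None


musician = resolve("taj mahal", ["blues", "album"])
monument = resolve("Taj Mahal", ["agra", "tickets"])
```

The string alone yields three candidates; only the surrounding context selects one, and the `edges` field is where graph traversal (for example from the monument to its country) would start.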

Knowledge Vault: A Web-Scale Approach to Probabilistic Knowledge Fusion noon99jaki.github.io pdf

Knowledge Vault is a Web-scale probabilistic knowledge base that automatically fuses facts extracted from Web content with prior knowledge derived from an existing knowledge base (Freebase), using a supervised machine learning pipeline that combines extractors, graph-based priors, and calibrated confidence scoring. The system holds 1.6 billion candidate facts and assigns each a calibrated probability by fusing extractor outputs with priors from path ranking and a neural embedding model; 324 million facts have confidence ≥0.7 and 271 million have confidence ≥0.9, far more than earlier automatically constructed knowledge bases, while maintaining measurable precision. This architecture targets automated entity-attribute resolution at crawl scale, supporting entity disambiguation, knowledge base population, and confidence-weighted fact retrieval without relying on manual curation.
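The fuse-then-threshold step can be sketched as follows. This is not the paper's method: the paper trains a supervised classifier over extractor and prior signals, whereas this sketch swaps in a simple noisy-OR combination purely for illustration. The triples, scores, and function names are invented toy values.

```python
from math import prod


def noisy_or(scores):
    """Combine independent confidence scores: 1 - product of (1 - s)."""
    return 1.0 - prod(1.0 - s for s in scores)


def fuse(candidate_facts, threshold=0.9):
    """Keep triples whose fused probability meets the threshold.

    candidate_facts: iterable of (triple, extractor_scores, prior), where
    the prior stands in for a graph-based estimate from an existing KB.
    Treating the prior as one more evidence source is an assumption here.
    """
    kept = {}
    for triple, extractor_scores, prior in candidate_facts:
        p = noisy_or(list(extractor_scores) + [prior])
        if p >= threshold:
            kept[triple] = p
    return kept


# Toy candidates: (subject, predicate, object), extractor scores, prior.
CANDIDATES = [
    (("Taj Mahal", "located_in", "Agra"), [0.8, 0.7], 0.5),
    (("Taj Mahal", "located_in", "Atlantic City"), [0.3], 0.05),
]

fused = fuse(CANDIDATES, threshold=0.9)
```

With these toy numbers the first triple is supported by two extractors plus a favourable prior and passes the 0.9 cut, while the weakly supported second triple is dropped; lowering `threshold` to 0.7 mirrors the looser confidence band mentioned in the summary.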