BERT Algorithm: Google's Natural Language Understanding Breakthrough
What is BERT
BERT (Bidirectional Encoder Representations from Transformers) is a natural language processing (NLP) model published by Google as a research paper in 2018 and applied to search in October 2019. It is considered one of the largest leaps in Google search history.
Before BERT, search engines processed words independently or read left to right only. BERT, as the name suggests, reads the entire sentence bidirectionally at once to grasp each word's meaning in context.
Meaning of BERT's arrival: At launch, Google described it as "the biggest leap forward in the past five years, and one of the biggest leaps forward in the history of Search."
The problem BERT solved: prepositions and context
A representative example shows BERT's effect.
Query: "2019 brazil traveler to usa need a visa"
- Before BERT: Matched keywords such as "usa" and "visa" without weighing "to" → returned results about U.S. citizens traveling to Brazil
- After BERT: Grasped directionality of "to usa" → returned visa information for Brazilians visiting the US
The point is that a single preposition, "to," completely changes the intent of the query. In morphologically rich languages, the same holds for particles and case markers: their context shifts meaning, and BERT is better at picking it up.
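To see why keyword matching alone cannot resolve this query, consider a minimal sketch that compares the example query with its reversed-direction counterpart. The `bag_of_words` helper and the second query are illustrative assumptions, not part of any Google system.

```python
from collections import Counter

def bag_of_words(query: str) -> Counter:
    """Count tokens while discarding their order, as a pure keyword matcher would."""
    return Counter(query.lower().split())

brazil_to_usa = "2019 brazil traveler to usa need a visa"
usa_to_brazil = "2019 usa traveler to brazil need a visa"  # hypothetical reversed query

# Same tokens with the same counts: an order-blind view cannot separate the two.
same_bag = bag_of_words(brazil_to_usa) == bag_of_words(usa_to_brazil)

# The token sequences differ, and that ordering is the signal a context-aware model uses.
same_sequence = brazil_to_usa.split() == usa_to_brazil.split()

print(same_bag)       # True
print(same_sequence)  # False
```

Both queries contain exactly the same keywords, so only a model that accounts for word order and the role of "to" can tell who is traveling where.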
Core technical principles of BERT
1. Bidirectional training
While models like GPT read text unidirectionally left to right, BERT reads the entire sentence at once, reflecting both left and right context for each word.
Example: The meaning of "cross" in "cross the river" requires both preceding and following context to be fully understood.
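The difference between the two reading directions can be sketched as a question of which positions each token is allowed to look at. The function name `visible_positions` is a hypothetical helper for illustration; real models express the same idea as an attention mask.

```python
def visible_positions(num_tokens: int, bidirectional: bool) -> list[list[int]]:
    """For each token position, list the positions it is allowed to attend to."""
    rows = []
    for i in range(num_tokens):
        if bidirectional:
            rows.append(list(range(num_tokens)))  # every token, left and right
        else:
            rows.append(list(range(i + 1)))       # only itself and earlier tokens
    return rows

tokens = ["cross", "the", "river"]
left_to_right = visible_positions(len(tokens), bidirectional=False)
both_directions = visible_positions(len(tokens), bidirectional=True)

for i, tok in enumerate(tokens):
    uni = [tokens[j] for j in left_to_right[i]]
    bi = [tokens[j] for j in both_directions[i]]
    print(f"{tok!r}: left-to-right sees {uni}, bidirectional sees {bi}")
```

In the left-to-right case, "cross" is encoded without ever seeing "river"; in the bidirectional case it sees the whole phrase.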
2. Masked Language Model
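During pre-training, about 15% of the tokens in each input sequence are selected at random, and the model is trained to predict the original tokens. Of the selected tokens, 80% are replaced with a [MASK] token, 10% with a random token, and 10% are left unchanged. Because the model must use the words on both sides of each gap, this builds the ability to understand full sentence context.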
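The masking step can be sketched in a few lines. This is a simplified illustration, not the original training code: it selects each token independently with probability 0.15 and then applies the 80% / 10% / 10% replacement split described in the BERT paper. The function name `mask_tokens` and the toy vocabulary are assumptions.

```python
import random

MASK = "[MASK]"

def mask_tokens(tokens, vocab, select_prob=0.15, seed=0):
    """Return (corrupted_tokens, labels); a label is None where no prediction is required."""
    rng = random.Random(seed)
    corrupted, labels = [], []
    for tok in tokens:
        if rng.random() < select_prob:
            labels.append(tok)                       # the model must recover this token
            roll = rng.random()
            if roll < 0.8:
                corrupted.append(MASK)               # 80%: replace with [MASK]
            elif roll < 0.9:
                corrupted.append(rng.choice(vocab))  # 10%: replace with a random token
            else:
                corrupted.append(tok)                # 10%: keep the original token
        else:
            labels.append(None)                      # not selected: left untouched
            corrupted.append(tok)
    return corrupted, labels

sentence = "the traveler from brazil needs a visa to enter the usa".split()
toy_vocab = ["river", "search", "visa", "travel"]    # hypothetical vocabulary
corrupted, labels = mask_tokens(sentence, toy_vocab, seed=7)
print(corrupted)
print(labels)
```

Positions whose label is `None` contribute nothing to the training loss; only the selected positions are scored.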
3. Next Sentence Prediction
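The model is trained to judge whether the second of two sentences actually follows the first in the source text or is an unrelated sentence. This objective is meant to teach relationships between sentences, which supports tasks such as question answering that depend on how one sentence relates to another.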
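Building training examples for this objective can be sketched as follows. In the BERT paper, half of the pairs use the true next sentence and half use a random sentence from the corpus; this simplified version draws the negative from the same list, and the function name `make_nsp_pairs` is an assumption.

```python
import random

def make_nsp_pairs(sentences, seed=0):
    """Build (sentence_a, sentence_b, is_next) examples from an ordered sentence list.

    Assumes at least three distinct sentences so a negative candidate always exists.
    """
    rng = random.Random(seed)
    pairs = []
    for i in range(len(sentences) - 1):
        if rng.random() < 0.5:
            # Positive example: the sentence that really follows.
            pairs.append((sentences[i], sentences[i + 1], True))
        else:
            # Negative example: any sentence except the current one and its true successor.
            candidates = [s for j, s in enumerate(sentences) if j not in (i, i + 1)]
            pairs.append((sentences[i], rng.choice(candidates), False))
    return pairs

doc = [
    "BERT was published in 2018.",
    "It was applied to search in 2019.",
    "Rivers can be crossed by bridge.",
    "Visa rules depend on nationality.",
]
for a, b, is_next in make_nsp_pairs(doc, seed=1):
    print(is_next, "|", a, "->", b)
```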
BERT's impact on search
At launch, Google said BERT would affect about 1 in 10 U.S. English searches; in December 2019 it expanded to more than 70 languages.
| Impact Area | Change |
|---|---|
| Long queries | Greatly improved understanding of complex 5+ word queries |
| Conversational search | Improved handling of conversational queries like "how," "why," "better than" |
| Prepositions and particles | Accurately grasps directionality and relationships by context |
| Featured snippets | Improved featured snippet quality through accurate intent understanding |
| Negation | Understands negative intent like "without X," "not doing X" |
BERT's practical impact on SEO
Changes required in SEO strategy after BERT:
Keyword stuffing loses further ground
Simple keyword repetition became even less effective. Because BERT helps Google interpret words in context, content written in natural language matches queries better than keyword lists do.
Search intent optimization is key
The same topic can have different intent depending on query context ("compare," "how to," "price," "review," etc.). After BERT, search intent matching became more important.
Long-tail keyword opportunities
BERT better understands specific, complex queries. Providing accurate answers to 3–5+ word specific queries increased exposure opportunities.
Write in natural language
Writing naturally like "What are the best tools for SEO in 2024?" is more BERT-friendly than listing keywords like "best SEO tools 2024."
From BERT to MUM: evolution
In 2021, Google announced MUM (Multitask Unified Model) as the next step after BERT. Like BERT, MUM is built on the Transformer architecture.
| Comparison | BERT | MUM |
|---|---|---|
| Language understanding | Bidirectional; trained per language or multilingually (mBERT) | Trained across 75 languages at once |
| Multimodal | Text only | Text and images processed together |
| Complexity | 110 million (BERT-Base) to 340 million (BERT-Large) parameters | Described by Google as 1,000x more powerful than BERT |
| Application | Query understanding | Complex query processing, AI answer generation |
Google AI Overviews (formerly SGE) run on a customized Gemini model, and the earlier SGE experiment drew on several models including MUM; BERT still operates as the basic query understanding layer.
BERT and Korean search
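BERT reached Korean search when Google expanded it to more than 70 languages in December 2019. Google has not published which model variant serves each language; multilingual BERT (mBERT) is the publicly released model that covers Korean. Characteristics of BERT on Korean text:
- Splits Korean's agglutinative forms (particles, ending variations) into subword (WordPiece) units, which often separate a stem from its particle but are not linguistically defined morphemes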
- Recognizes directional differences like "from Seoul to Busan" vs "from Seoul toward Busan"
- Improved understanding of colloquial Korean queries ("how do I do this?")
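As a rough illustration of how a particle can end up as its own unit, the sketch below applies WordPiece-style greedy longest-match tokenization, the scheme BERT's tokenizer uses, to two Korean words. The tiny vocabulary is hypothetical and chosen only so the split is visible; a real vocabulary is learned from data and will split many words differently.

```python
def wordpiece(word, vocab, unk="[UNK]"):
    """Greedy longest-match-first split of one word into subword pieces."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        match = None
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation pieces carry a ## prefix
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            return [unk]                      # no piece fits: the whole word is unknown
        pieces.append(match)
        start = end
    return pieces

# Hypothetical vocabulary: two place names plus the particles "from", "toward", "until".
toy_vocab = {"서울", "부산", "##에서", "##으로", "##까지"}

print(wordpiece("서울에서", toy_vocab))  # ['서울', '##에서']
print(wordpiece("부산으로", toy_vocab))  # ['부산', '##으로']
```

With the particle exposed as a separate piece, the model can learn that "서울에서" (from Seoul) and "부산으로" (toward Busan) mark different roles in a query.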
Naver develops its own Korean language models such as HyperCLOVA for Naver search, operating a natural language understanding system independent of Google.
Frequently asked questions
Q. My site rankings dropped after BERT. What should I do?
A. Pages affected by BERT often have content that does not match search intent. Reconfirm what users actually want for that keyword (how-to, definition, comparison, purchase, etc.) and rewrite the content to match that intent.
Q. How do I write BERT-optimized content specifically?
A. The most important thing is writing in natural sentences. Write as you would speak to readers, place a direct answer in the first paragraph, and use an FAQ section to cover the different ways people phrase the same question.
Q. How is BERT different from AI models like ChatGPT?
A. BERT is an encoder model for query understanding; GPT-family models are decoder models that generate text. Google Search uses BERT to grasp intent and Gemini-like generative models to create answers in AI Overviews.
Q. Is BERT still used, or fully replaced by MUM?
A. Both are used. MUM handles complex multi-step and multimodal queries; BERT handles fast basic query understanding. Google's search system combines multiple models in layers.
Q. Does Korean content benefit from BERT?
A. Yes. Korean was among the languages covered when BERT expanded to more than 70 languages, and Google reported significant featured snippet improvements in Korean. Naturally written Korean content tends to match queries better than content that forcibly lists keywords.
Related sources
- Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Google AI Language. https://arxiv.org/abs/1810.04805
- Nayak, P. (2019). Understanding searches better than ever before. Google Blog. https://blog.google/products/search/search-language-understanding-bert/
- Google Search Central (2024). How Google's ranking systems work. https://developers.google.com/search/docs/appearance/ranking-systems-guide