Nicole Michel

International Project Manager


Office International Natural Language Technologies

December 2014

Grammar Checking in Office Tech Preview

Background
Microsoft is in the process of replacing third-party proofing tool components such as
the Spellchecker, Thesaurus, Grammar Checker and others with replacements built
using Microsoft technology. This project extends over multiple versions of Office and
will facilitate quicker updates and fixes to our Proofing Tools, eliminate security risks and
reduce dependency on suppliers.
For Office 2013 we have introduced a new generation of Advanced Proofing for 6
languages. Advanced Proofing is a collection of features (punctuation checking,
grammar checking and contextual spell checking) that analyse text to identify
proofing errors and highlight those errors using blue underlines in Office 2013 and
later (or green underlines in Office 2010). Advanced Proofing complements the
Spellchecker, which identifies misspelt words and highlights them using red
underlines.
The remaining third-party Grammar Checkers will gradually be replaced with Advanced
Proofing in upcoming versions of Office.

The trickiest part of Advanced Proofing is Grammar Checking, as it is a complex
undertaking and requires the machine to unambiguously identify all components of
a sentence. For each language, one can build an unlimited number of different
sentences by combining different words in different orders. With the current
technology it's not possible to create a grammar checker that correctly assesses all
possible sentences in a language. Due to the complexity of sentence structures and
ambiguity inherent to language (e.g. different readings of the same word), the
component may sometimes also flag (add a squiggle to) correct sentences or not
flag a sentence with an actual grammar error. Our goal is to strive towards flagging
actual errors as much as possible while not flagging correct sentences. To do so,
we have selected and implemented high-accuracy grammar rules that target the
grammar errors found in vast corpora of real-world user data, so as to be as
helpful as possible to the user. Using real-world text ensures that our features focus
on the errors that authors actually make, rather than errors which may appear in a
grammar textbook, but which are very uncommon in reality.
Microsoft Office customers told us that they prefer an Advanced Proofing experience
that is more precise. This means that more of the items highlighted to them are
true errors that need to be corrected, and fewer are warnings about style
choices or items that are not errors at all. Avoiding stylistic revisions, which focus
on improving the manner in which ideas are conveyed rather than on correcting
obvious mistakes of language usage or grammar, was a conscious decision. A
particular style, e.g. the passive voice, a fragment, a long sentence or a foreign
word, may be intentional, and flagging it as an error contributes to the perception
of the Grammar Checker as inaccurate. Users would also like more help fixing the
errors highlighted to them, in the form of a suggested correction accompanying each
error. High accuracy and good suggestions are the two key quality measures for the
new generation of Advanced Proofing features being introduced to Microsoft Office.
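
The trade-off described above, between flagging real errors and not flagging correct text, is commonly expressed as precision and recall. The following is a minimal sketch of that arithmetic only. The class name, field names and counts are hypothetical and invented for illustration; they are not measurements from any Office component.

```python
from dataclasses import dataclass


@dataclass
class FlagCounts:
    """Hypothetical tallies from reviewing a checker's output on annotated text."""
    true_flags: int     # flagged, and really an error
    false_flags: int    # flagged, but the text was correct
    missed_errors: int  # real error that received no squiggle


def precision(c: FlagCounts) -> float:
    """Share of flags that point at real errors (high means little noise)."""
    flagged = c.true_flags + c.false_flags
    return c.true_flags / flagged if flagged else 0.0


def recall(c: FlagCounts) -> float:
    """Share of real errors that were flagged."""
    errors = c.true_flags + c.missed_errors
    return c.true_flags / errors if errors else 0.0


# Invented numbers: 100 flags shown, 90 of them real errors, 60 errors missed.
counts = FlagCounts(true_flags=90, false_flags=10, missed_errors=60)
print(f"precision={precision(counts):.2f} recall={recall(counts):.2f}")
# prints: precision=0.90 recall=0.60
```

In these terms, a checker tuned the way this section describes would favour a high precision value, accepting that some real errors go unflagged.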
Our new Advanced Proofing tools may intentionally offer a different set of grammar
rules than were present in the components they replace. This also means that we
have removed some of the less accurate grammar rules and for some languages we
have added other grammar rules or a Context Sensitive Spelling component, which
identifies correct words used incorrectly in the context.
On balance, however, we hope and expect that users will find the new Grammar
Checker components less noisy and more helpful than the old ones.

Q7 Office Desktop Preview Program (late October 2014)


In the Q7 Office Desktop Preview Program the following new Grammar Checkers were
available for download in the Proofing package (an exe that includes the MSI with all
proofing tools):

Finnish
Dutch
Italian
Russian
Danish
Arabic
Brazilian Portuguese

We also replaced the third-party Grammar Checker with a Context Sensitive Speller
component for:

Brazilian Portuguese
Norwegian Bokmål
Italian

Additions in Q8 Office Desktop Preview Program (mid-January 2015)


The same languages as in Q7 will continue to be available, and we're currently
working on new Grammar Checkers for the following languages, which are included
in the Q8 Office Desktop Preview Program:

English (Beta version)
German (Beta version)
Spanish (Beta version)
European Portuguese
Update for Finnish
Update for Brazilian Portuguese

The Grammar Checkers for English, Spanish and German are not complete yet as
some Grammar rules are not ready to ship.
Please see the Appendix at the end of this document (page 6) to get more detailed
information on what grammar rules are already available and which ones are still
under development.

The components will be available for download in late January most likely at the
following location: http://www.microsoft.com/en-us/download/details.aspx?id=44310
You will need to dogfood the latest Office Gemini Desktop build to be able to use
those components. The Grammar Checkers currently only ship in Desktop and not
with Office Online or the Office Universal Apps.

Ask of Dogfooders
Once you have installed the latest Office Gemini build (also known as Office 16)
please install the Language package (Proofing MSI) for the languages you're a
native speaker of or fluent in, use the new Grammar Checkers and report back any feedback
or suggestions you may have. The Proofing Tools are turned on by default and
Grammar Checkers are automatically enabled for Microsoft Word once installed.
Instructions to install the Proofing package (Proofing MSI) for your language:
You can either download the Proofing MSI directly from the Microsoft Proofing
Tools Beta page (http://www.microsoft.com/en-us/download/details.aspx?id=44310)
or perform the following steps:
1. Open a Word document and change the Proofing Language to your language
(e.g. Italian)
a. To change the Proofing language: Highlight some text in the document
and then click on the language name in the bottom-left corner of your
Word document:

b. A new window will pop up where you can choose your language:

NOTE: Ensure that "Do not check spelling and grammar" is unticked.
c. If you don't have Proofing Tools installed yet for your language, Word will
prompt you to download them through the Business bar at the top of the
Word document:

d. Click Download to install the Proofing MSI.

If you want to have a closer look at the Grammar Rules that are available for your
language you can do the following:
1. Open a Word document where the Proofing Language is set to your language.
a. The Proofing Tools already need to be installed on your machine for the
next steps to succeed
b. Please make sure to open the Word document in Edit mode, or choose
Enable Editing to get to Edit mode.
2. Go to File >> Options >> Proofing. The Word Options window will pop up. In
this window please click on Settings in the section "When correcting
spelling and grammar in Word":
3. A new window will pop up that shows you the available Grammar Rules for
your language:

EXAMPLE FOR THE OLD ENGLISH GRAMMAR CHECKER.

The Grammar Checkers are also included with Outlook, but are turned off by
default. To turn them on, follow these steps:
Go to File >> Options >> Mail and, under "Compose messages", click on "Spelling and
Autocorrect". This brings you to the Proofing Options window, where you can tick
"Mark Grammar errors as you type" to enable the Grammar Checkers.

There are no particular scenarios that need to be tested. Just use Microsoft Word
as usual to write your documents, articles, letters etc. and focus on the Grammar
Checker flags. Let us know, for example:

if it finds the grammar errors you frequently make (the blue squiggles
below a word or a phrase are from the Grammar Checker)
if it provides helpful suggestions or not when you right click on the
highlighted error
whether it misses many of the frequent errors you make (no squiggle
appears)
or whether it frequently highlights sentences that are correct
whether you miss any squiggles that the Grammar Checker in previous versions
of Office generated but the new one does not
etc.

NOTE: The grammar rules we choose to implement are targeted at finding the
grammar errors that users frequently make. We have collected huge real world data
sets to analyse and extract frequent error patterns. Therefore you will find that our
grammar checkers won't cover every single grammar rule that you may find in a
typical grammar book for your language, and that is intentional. It's not feasible to
address every possible grammar error one can think of, and we therefore focus on
targeting the most frequent ones to help a broad user base.
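
As a rough illustration of what "targeting the most frequent ones" means, the sketch below ranks error patterns by how often they occur in a sample. The labels and the sample list are hypothetical placeholders, not real data, and the actual analysis pipeline is not described in this document.

```python
from collections import Counter

# Hypothetical labels for errors annotated in a sample of user text.
observed_errors = [
    "a_or_an", "of_instead_of_have", "a_or_an", "double_space",
    "a_or_an", "double_space", "rare_textbook_case",
]


def most_frequent_patterns(errors, top_n):
    """Return the top_n error labels ranked by how often they occur."""
    return Counter(errors).most_common(top_n)


print(most_frequent_patterns(observed_errors, 2))
# prints: [('a_or_an', 3), ('double_space', 2)]
```

Patterns that fall below the cut-off, like the rare textbook case here, are the ones a frequency-driven rule set would leave uncovered.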
Any other comments you may have regarding the Grammar Checker User Interface,
or competitor products you may know or use are highly appreciated as well.
The goal is to find and fix major issues before the new Grammar Checkers go out to
a broad user base.

Appendix
German
Available in the current German Proofing Tools MSI:

| Grammar Option | Grammar Rules | Example Errors targeted by the rules in the Option |
|---|---|---|
| Großschreibung von Substantiven | Unrichtige Großschreibung | Sie ist Morgens immer müde. |
| Großschreibung von Substantiven | Substantiv kleingeschrieben | Sie gab darauf größte acht. |
| Kongruenz | Kongruenz in der Nominalphrase | Die schön Wohnung ist teuer. / Das sehr alte Baum war nicht mehr standsicher. |
| Wohlgeformtheit | Leicht verwechselbare Wörter | Wie seit ihr denn telefonisch erreichbar? |
| Wohlgeformtheit | Großgeschriebene Adjektive | Das kann etwas richtig fantastisches werden. |
| Wohlgeformtheit | Verwendung von Mal | Ich habe das zwei mal gemacht. |
| Zusammengesetzte Wörter | Verwendung von derselbe | Der Himmel ist überall der selbe. |
| Zusammengesetzte Wörter | Verwendung von irgend bei Pronomina | Irgend eine Überraschung gibt es immer. |
| Zusammengesetzte Wörter | Bindestriche | Sie können sich per E Mail anmelden. |
| Zusammengesetzte Wörter | Verben mit einem trennbaren Präfix | Wir hätten die Kinder zuerst ab holen sollen. |
| Interpunktion | Interpunktion | Wir haben das Turnier gewonnen!. |
| Leerzeichen | Zu viele Leerzeichen | Die Zeile war besonders lang. |
| Leerzeichen | Leerzeichen vor oder nach Klammern | In der Notiz steht, dass das Picknick( nur für Mitarbeiter) |

Still in development and only available in a future refresh of the Office Tech Preview:

| Grammar Option | Grammar Rules | Example Errors targeted by the rules in the Option |
|---|---|---|
| Kongruenz | Kongruenz zwischen Subjekt und Verb | Die Arbeitszeit betragt mindestens 20 Stunden pro Monat. |
| Verwendung von Kommas | Komma im Relativsatz | Menschen die recht haben, stehen meistens allein. |
| Verwendung von Kommas | Komma im Nebensatz | Ich brauche einen Fotoapparat um meine Hunde zu fotografieren. |
| Wohlgeformtheit | Verwechslung von dass und das | Ich hab mein Ziel und dass will ich erreichen. / Also ich hab mir gedacht das Sonntag Abend eine gute Zeit wäre? |

Spanish
Available in the current Spanish Proofing Tools MSI:

| Grammar Option | Grammar Rules | Example Errors targeted by the rules in the Option |
|---|---|---|
| Concordancia de los pronombres | Concordancia del clítico | Ella le sonríe a sus padres. |
| Confusión de palabras | Confusión de verbos | Como le comente, el trabajo ya está listo. |
| Confusión de palabras | Confusión de palabras | Ojala todo sea cierto. |
| Uso incorrecto de 'de que' | Uso incorrecto de 'de que' | Yo pienso de que todo irá bien. |
| Verbo impersonal | Verbo impersonal | Pueden haber problemas. |
| Uso incorrecto de mayúsculas | Uso de mayúsculas | Iré a verte el Martes. |
| Puntuación | Signos de exclamación e interrogación | Quién descubrió América? |
| Puntuación | Uso de la coma | Dijo que si quieres, vendrá mañana. |
| Puntuación | Signos de puntuación consecutivos | No lo sé;, creo que es Bach. |
| Uso de espacios | Sobra un espacio | La hora de la comida ( solo para empleados) ha cambiado. |
| Uso de espacios | Falta un espacio | Todos queremos sillas nuevas,mejor comida y un horario más flexible. |
| Uso de espacios | Espacio mal colocado | Así es !Tengo que revisar el resultado yo mismo. |
| Uso de espacios | Más de un espacio entre palabras | La línea era muy larga. |

Still in development and only available in a future refresh of the Office Tech Preview:

| Grammar Option | Grammar Rules | Example Errors targeted by the rules in the Option |
|---|---|---|
| Concordancia del adjetivo o participio | Concordancia entre sujeto y atributo | Nosotros fuimos investigado. |
| Concordancia del sujeto con el verbo | Concordancia del sujeto con el verbo | Los víveres se acabará la semana próxima. |
| Concordancia en el grupo nominal | Concordancia entre determinante y nombre | Este víveres son muy caros. |
| Concordancia en el grupo nominal | Concordancia entre nombre y adjetivo | Qué coches tan bonita. |

English
Available in the current English Proofing Tools MSI:

| Grammar Option | Grammar Rules | Example Errors targeted by the rules in the Option |
|---|---|---|
| Misused Words | A or An | We waited there for at least a hour. |
| Misused Words | Real vs. Really | He is driving real carefully. |
| Misused Words | Confusable words | Could you please advice me? |
| Misused Words | Comparative Use | This is more bigger than I thought. |
| Misused Words | Too many determiners | I gave you a the carrot |
| Hyphenation | Hyphen Use | Our five year old son is already learning to read |
| Capitalization in Sentences | Capitalization in Titles | Of Mice And Men is a novel by John Steinbeck. |
| Capitalization in Sentences | Capitalization of Common Nouns | He is a proud alumnus of the university of Wisconsin. |
| Capitalization in Sentences | Capitalization after a Comma | Despite searching everywhere, The keys were nowhere to be found. |
| Punctuation | Punctuation in succession | We won the tournament!. |
| Spacing | Spacing before and after Punctuation | The picnic(employees only ) was rescheduled. |
| Spacing | Too many spaces between words | The line was extra long. |
| Verb Use | "Of" instead of "have" | I could of known that. |
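
To give a feel for what a single rule does, a few of the simpler English rules listed above can be approximated with regular expressions. This is only an illustrative sketch: the patterns, the `RULES` list and the `check` function are hypothetical, and nothing here reflects how the Office Grammar Checkers are actually implemented.

```python
import re

# Each entry: (rule name, pattern to flag, function that builds a suggestion).
# The patterns are illustrative assumptions, not the shipped rules.
RULES = [
    ("Punctuation in succession",
     re.compile(r"([!?])\."),
     lambda m: m.group(1)),
    ("Too many spaces between words",
     re.compile(r"(?<=\S) {2,}(?=\S)"),
     lambda m: " "),
    ('"Of" instead of "have"',
     re.compile(r"\b(could|should|would) of\b", re.IGNORECASE),
     lambda m: f"{m.group(1)} have"),
]


def check(sentence):
    """Return (rule name, flagged text, suggestion) for every match found."""
    flags = []
    for name, pattern, suggest in RULES:
        for match in pattern.finditer(sentence):
            flags.append((name, match.group(0), suggest(match)))
    return flags


print(check("We won the tournament!."))
# prints: [('Punctuation in succession', '!.', '!')]
print(check("I could of known that."))
# prints: [('"Of" instead of "have"', 'could of', 'could have')]
```

A pattern this naive also flags the correct sentence "I could of course go.", which is exactly the kind of false flag the accuracy goal in the Background section is meant to avoid. A real rule needs more context than a regular expression can see.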

Still in development and only available in a future refresh of the Office Tech Preview:

| Grammar Option | Grammar Rules | Example Errors targeted by the rules in the Option |
|---|---|---|
| Noun phrases | Number agreement | I would like to buy this apples. |
| Possessives and Plurals | Possessive and Plural Forms | My mothers house is huge. |
| Subject-Verb agreement | Number agreement between subject and verb | The men walks to the bar. |
| Comma Use | Comma instead of semicolon | Also, I know Mike, and he's not one bit like that, he's usually either quiet or really funny. |
| Comma Use | Comma missing after conditional clause | If I have missed anything please identify what you would like to see in the statement. |
| Comma Use | Redundant comma in conjoined phrases | Could you provide the freight number, and the company who is delivering. |
| Questions | Missing question mark | Are you agreeing with the above. |
