
Domain Specific Languages in Python
Siddharta Govindaraj
siddharta@silverstripesoftware.com
What are DSLs?
Specialized mini-languages for specific problem
domains that make it easier to work in that
domain
Example: SQL
SQL is a mini language specialized to retrieve data
from a relational database
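As a rough illustration of SQL acting as a mini-language inside a host program, the sketch below uses Python's built-in sqlite3 module with an in-memory database. The users table and its rows are made up for this example.

```python
import sqlite3

# In-memory database; the table and rows are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", 31), ("bob", 17), ("carol", 45)],
)

# The query states *what* we want; the database engine decides how to fetch it.
adults = [row[0] for row in conn.execute(
    "SELECT name FROM users WHERE age >= 18 ORDER BY name"
)]
print(adults)
conn.close()
```

The Python code only moves data in and out; the filtering and sorting logic lives entirely in the SQL string.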
Example: Regular Expressions
Regular Expressions are mini languages
specialized to express string patterns to match
Life Without Regular Expressions
    def is_ip_address(ip_address_string):
        components = ip_address_string.split(".")
        if len(components) != 4: return False
        try:
            int_components = [int(component) for component in components]
        except ValueError:
            return False
        for component in int_components:
            if component < 0 or component > 255:
                return False
        return True
Life With Regular Expressions
    import re

    def is_ip(ip_address_string):
        # The dots are escaped so that only a literal "." separates components
        match = re.match(r"^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$",
                         ip_address_string)
        if not match: return False
        for component in match.groups():
            if int(component) < 0 or int(component) > 255:
                return False
        return True
The DSL that simplifies our life
    ^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$
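To make the mini-language nature of the pattern more visible, here is a sketch that spells it out with re.VERBOSE so each piece can carry a comment. The name IP_PATTERN is an illustrative choice, and the range check is still done in Python because the pattern only describes the shape of the string.

```python
import re

# The pattern, one piece per line; re.VERBOSE ignores the whitespace and comments.
IP_PATTERN = re.compile(r"""
    ^(\d{1,3})   # first component: one to three digits
    \.(\d{1,3})  # a literal dot, then the second component
    \.(\d{1,3})  # third component
    \.(\d{1,3})  # fourth component
    $            # nothing may follow
""", re.VERBOSE)

def is_ip(ip_address_string):
    match = IP_PATTERN.match(ip_address_string)
    if not match:
        return False
    # The regex only checks the shape; the numeric range is checked here.
    return all(0 <= int(component) <= 255 for component in match.groups())

print(is_ip("192.168.1.1"), is_ip("256.1.1.1"))
```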
Why DSL? Answered
When working in a particular domain, write your
code in a syntax that fits the domain.
When working with patterns, use RegEx
When working with RDBMS, use SQL
When working in your domain create your own DSL
The two types of DSLs
External DSL: The code is written in an external
file or as a string, which is read and parsed by the
application
Internal DSL: Use features of the language (like
metaclasses) to enable people to write code in
Python that resembles the domain syntax
Creating Forms - No DSL
    <form>
    <label>Name:</label><input type="text" name="name"/>
    <label>Email:</label><input type="text" name="email"/>
    <label>Password:</label><input type="password" name="password"/>
    </form>
- Requires HTML knowledge to maintain
- Therefore it is not possible for the end user to
change the structure of the form by themselves
Creating Forms - External DSL
    UserForm
    name->CharField label:Username
    email->EmailField label:Email Address
    password->PasswordField
This text file is parsed and rendered by the app
+ Easy to understand form structure
+ Can be easily edited by end users
- Requires you to read and parse the file
Creating Forms - Internal DSL
    from django import forms

    class UserForm(forms.Form):
        username = forms.RegexField(regex=r'^\w+$', max_length=30)
        email = forms.EmailField(max_length=75)
        password = forms.CharField(widget=forms.PasswordInput())
Django uses metaclass magic to convert this
syntax to an easily manipulated Python class
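The general idea behind that metaclass magic can be sketched in a few lines. This is not Django's actual implementation; Field, FormMeta and Form below are hypothetical stand-ins that only show how a metaclass can collect the attributes declared in a class body.

```python
class Field(object):
    """Hypothetical field type; real frameworks carry far more behaviour."""
    def __init__(self, **options):
        self.options = options

class FormMeta(type):
    """Collects Field attributes declared in the class body into `fields`."""
    def __new__(mcs, name, bases, namespace):
        fields = {key: value for key, value in namespace.items()
                  if isinstance(value, Field)}
        cls = super().__new__(mcs, name, bases, namespace)
        cls.fields = fields
        return cls

class Form(metaclass=FormMeta):
    pass

class UserForm(Form):
    username = Field(max_length=30)
    email = Field(max_length=75)

print(sorted(UserForm.fields))
```

The person writing UserForm never calls the metaclass directly; declaring attributes is enough, which is what makes the syntax feel like a small language of its own.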
+ Easy to understand form structure
+ Easy to work with the form as it is regular Python
+ No need to read and parse the file
- Cannot be used by non-programmers
- Can sometimes be complicated to implement
- Behind-the-scenes magic means debugging hell
Creating an External DSL
    UserForm
    name:CharField -> label:Username size:25
    email:EmailField -> size:32
    password:PasswordField
Let's write code to parse and render this form
Options for Parsing
Using string functions: You have to be crazy
Using regular expressions:
    Some people, when confronted with a problem, think "I know, I'll use
    regular expressions." Now they have two problems. - Jamie Zawinski
Writing a parser (we will use PyParsing)
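For contrast, this is roughly what the string-functions option looks like for the form format shown earlier. It is only a sketch: parse_form_with_split is a hypothetical helper, it assumes well-formed input, and it gives no useful error message when the input is malformed.

```python
def parse_form_with_split(text):
    """Parse the form format using only str methods (hypothetical helper)."""
    lines = [line.strip() for line in text.strip().splitlines()]
    form_name, field_lines = lines[0], lines[1:]
    fields = []
    for line in field_lines:
        # A line looks like "name:CharField -> label:Username size:25"
        head, _, tail = line.partition("->")
        field_name, field_type = head.strip().split(":")
        properties = dict(prop.split(":") for prop in tail.split())
        fields.append((field_name, field_type, properties))
    return form_name, fields

input_form = """UserForm
name:CharField -> label:Username size:25
email:EmailField -> size:32
password:PasswordField
"""
form_name, fields = parse_form_with_split(input_form)
print(form_name, fields)
```

Every change to the format (quoted values, comments, optional spaces) means another special case here, which is the argument for describing the format as a grammar instead.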
Step 1: Get PyParsing
pip install pyparsing
Step 2: Design the Grammar
    form ::= form_name newline field+
    field ::= field_name colon field_type [arrow property+]
    property ::= key colon value
    form_name ::= word
    field_name ::= word
    field_type ::= CharField | EmailField | PasswordField
    key ::= word
    value ::= alphanumeric+
    word ::= alpha+
    newline ::= \n
    colon ::= :
    arrow ::= ->
Quick Note
Backus-Naur Form (BNF) is a syntax for
specifying grammars
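Step 3: Implement the Grammar
    from pyparsing import (OneOrMore, Optional, ParserElement, Word,
                           alphanums, alphas, oneOf)

    # pyparsing skips "\n" as whitespace by default; keep it significant
    ParserElement.setDefaultWhitespaceChars(" \t")

    newline = "\n"
    colon = ":"
    arrow = "->"
    word = Word(alphas)
    key = word
    value = Word(alphanums)
    field_type = oneOf("CharField EmailField PasswordField")
    field_name = word
    form_name = word
    field_property = key + colon + value
    field = (field_name + colon + field_type +
             Optional(arrow + OneOrMore(field_property)) + newline)
    form = form_name + newline + OneOrMore(field)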
Quick Note
PyParsing itself implements a neat little internal
DSL for you to describe the parser grammar
Notice how the PyParsing code almost perfectly
reflects the BNF grammar
Output
    >>> print(form.parseString(input_form))
    ['UserForm', '\n', 'name', ':', 'CharField', '->',
    'label', ':', 'Username', 'size', ':', '25', '\n',
    'email', ':', 'EmailField', '->', 'size', ':', '32', '\n',
    'password', ':', 'PasswordField', '\n']
PyParsing has neatly parsed our form input into
tokens. That's nice, but we can do more.
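The slides never show input_form itself, so here is a self-contained sketch of the step that can be run as-is. It assumes pyparsing is installed, that input_form is the sample form with a trailing newline, and that newlines are kept out of pyparsing's skipped whitespace so the grammar can match them.

```python
from pyparsing import (OneOrMore, Optional, ParserElement, Word,
                       alphanums, alphas, oneOf)

# Assumption: only spaces and tabs are skipped, so "\n" stays significant.
# This has to run before any grammar element is created.
ParserElement.setDefaultWhitespaceChars(" \t")

newline = "\n"
colon = ":"
arrow = "->"
word = Word(alphas)
key = word
value = Word(alphanums)
field_type = oneOf("CharField EmailField PasswordField")
field_name = word
form_name = word
field_property = key + colon + value
field = (field_name + colon + field_type
         + Optional(arrow + OneOrMore(field_property)) + newline)
form = form_name + newline + OneOrMore(field)

# Sample input assumed for this sketch; note the newline after the last field.
input_form = """UserForm
name:CharField -> label:Username size:25
email:EmailField -> size:32
password:PasswordField
"""

tokens = form.parseString(input_form).asList()
print(tokens)
```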
Step 4: Suppressing Noise Tokens
    newline = Suppress("\n")
    colon = Suppress(":")
    arrow = Suppress("->")
Output
    >>> print(form.parseString(input_form))
    ['UserForm', 'name', 'CharField', 'label', 'Username',
    'size', '25', 'email', 'EmailField', 'size', '32',
    'password', 'PasswordField']
All the noise tokens are now removed from the
parsed output
Step 5: Grouping Tokens
    field_property = Group(key + colon + value)
    field = Group(field_name + colon + field_type +
                  Group(Optional(arrow + OneOrMore(field_property))) +
                  newline)
Output
    >>> print(form.parseString(input_form))
    ['UserForm',
     ['name', 'CharField',
      [['label', 'Username'], ['size', '25']]],
     ['email', 'EmailField',
      [['size', '32']]],
     ['password', 'PasswordField', []]]
Related tokens are now grouped together in a list
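Step 6: Give Names to Tokens
    form_name = word.setResultsName("form_name")
    field = Group(field_name + colon + field_type +
                  Group(Optional(arrow + OneOrMore(field_property))) +
                  newline).setResultsName("form_field")
    # The other names used later (fields, field_name, field_type,
    # property_key, property_value) are assigned the same way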
Output
    >>> parsed_form = form.parseString(input_form)
    >>> print(parsed_form.form_name)
    UserForm
    >>> print(parsed_form.fields[1].field_type)
    EmailField
Now we can refer to parsed tokens by name
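Steps 4 to 6 can be put together into one runnable sketch. The slides only show part of the naming, so the placement of the fields, field_name, field_type, property_key and property_value names below is an assumption, chosen so that the lookups in the output slide work. It also assumes pyparsing is installed and the same sample input as before.

```python
from pyparsing import (Group, OneOrMore, Optional, ParserElement, Suppress,
                       Word, alphanums, alphas, oneOf)

# Assumption: keep "\n" out of the skipped whitespace so line ends can be matched.
ParserElement.setDefaultWhitespaceChars(" \t")

# Step 4: noise tokens are matched but dropped from the results.
newline = Suppress("\n")
colon = Suppress(":")
arrow = Suppress("->")

# Step 6: names attached to the tokens we want to look up later.
word = Word(alphas)
key = word.setResultsName("property_key")
value = Word(alphanums).setResultsName("property_value")
field_type = oneOf("CharField EmailField PasswordField").setResultsName("field_type")
field_name = word.setResultsName("field_name")
form_name = word.setResultsName("form_name")

# Step 5: related tokens are grouped into nested lists.
field_property = Group(key + colon + value)
field = Group(field_name + colon + field_type
              + Group(Optional(arrow + OneOrMore(field_property))) + newline)
# Naming the repeated group "fields" is an assumption made for this sketch.
form = form_name + newline + OneOrMore(field).setResultsName("fields")

input_form = """UserForm
name:CharField -> label:Username size:25
email:EmailField -> size:32
password:PasswordField
"""

parsed_form = form.parseString(input_form)
print(parsed_form.form_name)
print(parsed_form.fields[1].field_type)
```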
Step 7: Convert Properties to Dict
    def convert_prop_to_dict(tokens):
        prop_dict = {}
        for token in tokens:
            prop_dict[token.property_key] = token.property_value
        return prop_dict

    field = Group(field_name + colon + field_type +
                  Optional(arrow + OneOrMore(field_property))
                  .setParseAction(convert_prop_to_dict) +
                  newline).setResultsName("form_field")
Output
    >>> print(form.parseString(input_form))
    ['UserForm',
     ['name', 'CharField',
      {'size': '25', 'label': 'Username'}],
     ['email', 'EmailField',
      {'size': '32'}],
     ['password', 'PasswordField', {}]
    ]
Sweet! The field properties are parsed into a dict
Step 7: Generate HTML Output
We need to walk through the parsed form and
generate an HTML string out of it
    def get_field_html(field):
        properties = field[2]
        label = properties["label"] if "label" in properties else field.field_name
        label_html = "<label>" + label + "</label>"
        attributes = {"name": field.field_name}
        attributes.update(properties)
        if field.field_type == "CharField" or field.field_type == "EmailField":
            attributes["type"] = "text"
        else:
            attributes["type"] = "password"
        if "label" in attributes:
            del attributes["label"]
        attributes_html = " ".join([name + "='" + value + "'" for name, value in attributes.items()])
        field_html = "<input " + attributes_html + "/>"
        return label_html + field_html + "<br/>"

    def render(form):
        fields_html = "".join([get_field_html(field) for field in form.fields])
        return "<form id='" + form.form_name.lower() + "'>" + fields_html + "</form>"
Output
    >>> print(render(form.parseString(input_form)))
    <form id='userform'>
    <label>Username</label>
    <input type='text' name='name' size='25'/><br/>
    <label>email</label>
    <input type='text' name='email' size='32'/><br/>
    <label>password</label>
    <input type='password' name='password'/><br/>
    </form>
It works! But....
Yuck!
The output rendering code is an UGLY MESS
Wish we could do this...
    >>> print(Form(CharField(name="user", size="25", label="ID"),
    ...            id="myform"))
    <form id='myform'>
    <label>ID</label>
    <input type='text' name='user' size='25'/><br/>
    </form>
Neat, clean syntax that matches the output domain
well. But how do we create this kind of syntax?
Let's create an Internal DSL
    class HtmlElement(object):
        default_attributes = {}
        tag = "unknown_tag"

        def __init__(self, *args, **kwargs):
            self.attributes = kwargs
            self.attributes.update(self.default_attributes)
            self.children = args

        def __str__(self):
            attribute_html = " ".join(["{}='{}'".format(name, value)
                                       for name, value in self.attributes.items()])
            if not self.children:
                return "<{} {}/>".format(self.tag, attribute_html)
            else:
                children_html = "".join([str(child) for child in self.children])
                return "<{} {}>{}</{}>".format(self.tag, attribute_html,
                                               children_html, self.tag)

    >>> print(HtmlElement(id="test"))
    <unknown_tag id='test'/>
    >>> print(HtmlElement(HtmlElement(name="test"), id="id"))
    <unknown_tag id='id'><unknown_tag name='test'/></unknown_tag>
    class Input(HtmlElement):
        tag = "input"

        def __init__(self, *args, **kwargs):
            HtmlElement.__init__(self, *args, **kwargs)
            # The label is rendered as text, not as an attribute of <input>
            self.label = (self.attributes["label"] if "label" in self.attributes
                          else self.attributes["name"])
            if "label" in self.attributes:
                del self.attributes["label"]

        def __str__(self):
            label_html = "<label>{}</label>".format(self.label)
            return label_html + HtmlElement.__str__(self) + "<br/>"

    >>> print(Input(name="username"))
    <label>username</label><input name='username'/><br/>
    >>> print(Input(name="username", label="User ID"))
    <label>User ID</label><input name='username'/><br/>

    class Form(HtmlElement):
        tag = "form"

    class CharField(Input):
        default_attributes = {"type": "text"}

    class EmailField(CharField):
        pass

    class PasswordField(Input):
        default_attributes = {"type": "password"}
Now...
    >>> print(Form(CharField(name="user", size="25", label="ID"),
    ...            id="myform"))
    <form id='myform'>
    <label>ID</label>
    <input type='text' name='user' size='25'/><br/>
    </form>
Nice!
Step 7 Revisited: Output HTML
    def render(form):
        field_dict = {"CharField": CharField, "EmailField": EmailField,
                      "PasswordField": PasswordField}
        fields = [field_dict[field.field_type](name=field.field_name, **field[2])
                  for field in form.fields]
        return Form(*fields, id=form.form_name.lower())
Now our output code uses our Internal DSL!
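Here is a self-contained sketch of that last step which runs without the parser. The element classes are condensed versions of the ones above, and the parser output is replaced by a plain list of (name, type, properties) tuples, which is an assumption made to keep the example short.

```python
class HtmlElement(object):
    default_attributes = {}
    tag = "unknown_tag"

    def __init__(self, *args, **kwargs):
        self.attributes = dict(kwargs)
        self.attributes.update(self.default_attributes)
        self.children = args

    def __str__(self):
        attribute_html = " ".join(
            "{}='{}'".format(name, value)
            for name, value in self.attributes.items())
        if not self.children:
            return "<{} {}/>".format(self.tag, attribute_html)
        children_html = "".join(str(child) for child in self.children)
        return "<{0} {1}>{2}</{0}>".format(self.tag, attribute_html, children_html)

class Input(HtmlElement):
    tag = "input"

    def __init__(self, *args, **kwargs):
        HtmlElement.__init__(self, *args, **kwargs)
        # The label is shown as text, so it is pulled out of the attributes.
        self.label = self.attributes.pop("label", self.attributes["name"])

    def __str__(self):
        label_html = "<label>{}</label>".format(self.label)
        return label_html + HtmlElement.__str__(self) + "<br/>"

class Form(HtmlElement):
    tag = "form"

class CharField(Input):
    default_attributes = {"type": "text"}

class EmailField(CharField):
    pass

class PasswordField(Input):
    default_attributes = {"type": "password"}

def render(form_name, fields):
    """Map (name, type, properties) tuples onto the element classes."""
    field_dict = {"CharField": CharField, "EmailField": EmailField,
                  "PasswordField": PasswordField}
    elements = [field_dict[field_type](name=name, **properties)
                for name, field_type, properties in fields]
    return Form(*elements, id=form_name.lower())

# Stand-in for the parser output (hypothetical data in the same shape).
fields = [("name", "CharField", {"label": "Username", "size": "25"}),
          ("email", "EmailField", {"size": "32"}),
          ("password", "PasswordField", {})]
html = str(render("UserForm", fields))
print(html)
```

The dictionary lookup is the only place where the external DSL's type names meet the internal DSL's classes, so adding a new field type means adding one class and one dictionary entry.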
INPUT
    UserForm
    name:CharField -> label:Username size:25
    email:EmailField -> size:32
    password:PasswordField
OUTPUT
    <form id='userform'>
    <label>Username</label>
    <input type='text' name='name' size='25'/><br/>
    <label>email</label>
    <input type='text' name='email' size='32'/><br/>
    <label>password</label>
    <input type='password' name='password'/><br/>
    </form>
Get the whole code
http://bit.ly/pyconindia_dsl
Summary
+ DSLs make your code easier to read
+ DSLs make your code easier to write
+ DSLs make it easy for non-programmers to
maintain code
+ PyParsing makes it easy to write External DSLs
+ Python makes it easy to write Internal DSLs
