
ZoloPages

© 2008 Alfred Zolo



Table of Contents

Part I Introduction

Part II ZoloPages
1 ZoloPages Extractor (ZPG)
Extractor - Surf and Collect
Extractor - Select and Save
Settings

Part III ZoloMask Creator (ZMC)
1 Base Markers
2 Wild Cards


1 Introduction

ZoloPages: From Web Page to Database in One Click!

ZoloPages is a Name extractor, Address extractor, Phone extractor, Fax extractor, URL extractor and
Email extractor for any web page that displays such data. Fully customizable, it lets you develop
your own ZoloMasks to carry out tedious data mining tasks, such as (but not limited to) retrieving data
from the white, pink, yellow or green pages of almost any country in the world. You can
grab data from the web in no time! ZoloPages can then save the data you've selected in Microsoft
Excel or Microsoft Outlook format for future use.

Visit http://www.zolopages.com for more recent versions of this software and manual/help file.

ZoloPages is a suite with two main applications:

ZoloPages Extractor - captures data from online services. The extractor is supplied
with no guarantee implied, and with no templates except the one in the picture above. If you're looking for
specific ZPG templates for your own country, you will have to resort to third-party web sites (including,
but not limited to, http://www.zolomask.com in Asia).

ZoloMask Editor - lets you edit and create filters specific to particular web
services.


2 ZoloPages
2.1 ZoloPages Extractor (ZPG)

The Extractor application features two main screens, or tabs:

1. Surf and Collect

2. Select and Save


2.1.1 Extractor - Surf and Collect

Click on FILE then OPEN in the top menu.

Select a web source. The corresponding web page will be displayed in the integrated browser.


Check AUTOMODE if you wish to capture more than one page automatically.

Then click on the CAPTURE button when you are satisfied with the data displayed on the page.

ZoloPages is now ready to capture data. It will save that data in the second tab, Select and Save.

2.1.2 Extractor - Select and Save


This page holds the data captured from the web page, stored in a regular grid. Please note (below) that
no data has been selected at this point: no checkbox is ticked.

Select the data you wish to save to disk: either individually, or several at a time by right-clicking on the
SELECT button.


The following menu appears:

With these menu items you can SELECT or UNSELECT all records. You can also
invert your selection and even delete part or all of it.

Save the data you've selected by right-clicking on the SAVE button.

The following menu appears:

With these menu items you can save to various common formats, including XML, HTML,
and even the ZoloPages proprietary ZPD format.

The most common formats are MS Outlook (individual Contacts entries), CSV
(text-based) for Excel and Access, Excel itself, and Word.

Click on the item of your choice. The following dialog will pop up.


Give the new file the name of your choice and save it where you see fit. That's it!

2.1.3 Settings

This part deals only with the way the data will be saved in MS Outlook.


1. NAME FORMAT

· Keep order the same: ANDREW FINNEGAN will be stored as ANDREW FINNEGAN in Outlook
Contacts.
· Invert Name+First Name: BAILLY PAUL will be stored as PAUL BAILLY in Outlook Contacts.
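
The effect of the two name-format options can be sketched in Python. This is only an illustration: the function name `format_name` is hypothetical, and the sketch assumes that the captured name starts with the family name as a single word, which the manual does not state.

```python
def format_name(raw: str, invert: bool) -> str:
    """Return the name as it might be stored under each NAME FORMAT option.

    Assumption: when inverting, the family name is the first
    whitespace-separated word of the captured name.
    """
    if not invert:
        return raw  # "Keep order the same"
    parts = raw.split()
    if len(parts) < 2:
        return raw  # a single word cannot be inverted
    # "Invert Name+First Name": move the first word to the end
    return " ".join(parts[1:] + parts[:1])


print(format_name("ANDREW FINNEGAN", invert=False))  # ANDREW FINNEGAN
print(format_name("BAILLY PAUL", invert=True))       # PAUL BAILLY
```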

2. ADDRESS DISPLAY

· Full Address including city:

Industrial Area # 11, Sharjah
P.O.Box : 63120, Sharjah

· Street Address only:

Industrial Area # 11, Sharjah

The rest of the address (63120, Sharjah) will still be stored in the Zip Code and City fields.

3. CONTACT TYPE

A personal contact will be saved with the HOME PHONE and HOME ADDRESS selections, whereas
a business contact will be saved with the BUSINESS PHONE and BUSINESS ADDRESS selections.

4. CONFIRMATION

An Office/Outlook confirmation screen will pop up so that you can verify the information entered by
ZoloPages in your Contacts, and make sure the Name format, Address Display and Contact Type
are correct.

WARNING: This is not ideal when you save multiple contacts, as many contact screens will pop up
simultaneously. Reserve this option for saving a single contact, or for double-checking the format
of one record before saving all the others!


WARNING: You must think carefully before saving multiple contacts to Outlook!

3 ZoloMask Creator (ZMC)

ZoloMask Editor is a tool for editing and creating filters usable by ZoloPages.

What you need to do is work with the HTML code for each page/data template, and isolate the
elements that constitute a data entry on any given page.

Creating a new ZoloMask - Step by step process:

1. Navigate to any web page you need to capture data from (first tab in left pane). Enter the URL and
press ENTER.

2. Click on the "Load HTML from Browser" button.


What you now see below the browser is roughly this:

The HTML code for the current page is displayed. If there is too much HTML data, simply isolate one
address, and press LOAD HTML again. Only the browser selection will appear.

3. Isolate the boundary elements for one entry. Use the FIND button (CTRL+F) if necessary, as
some HTML pages are not very legible.

You can select text from the HTML View and drop it into the various fields below. If you press CTRL
while dragging, your text will not replace the current one. On the contrary, it will be added to it and the
marker "|" (or) will be inserted between the two.


Then click on the TEST CURRENT MASK button.

Then go to the ONE ENTRY pane on the left. One entry only should be displayed there, as opposed to
the whole HTML code in the WHOLE DOCUMENT pane.


Then go to the second pane, called TEST FILTERS. Verify that your base markers are all correct
before proceeding to the phone, fax, http and email field definitions.

Then SAVE your ZOLOMASK by clicking on the rightmost button (the disk) in the first toolbar.

Give it a specific name. The extension .ZPG will be added to the filename.
Try to follow the standards set by ZoloMask:

1. two-letter country code
2. followed by a dash
3. full descriptive name

Example:
IN-PagesRojes: Pages Rojes service of India (doesn't exist, of course)
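
As a rough illustration of this naming convention, the Python sketch below builds and checks a filename. The helper `mask_filename` is hypothetical and not part of ZoloMask, and the exact set of characters allowed in the descriptive name (letters and digits here) is an assumption, since the manual only gives the three-part rule.

```python
import re

# Assumed shape: two uppercase letters, a dash, then letters and digits.
MASK_NAME = re.compile(r"[A-Z]{2}-[A-Za-z0-9]+")


def mask_filename(country_code: str, description: str) -> str:
    """Build a filename such as 'IN-PagesRojes.ZPG' or raise ValueError."""
    name = f"{country_code.upper()}-{description}"
    if not MASK_NAME.fullmatch(name):
        raise ValueError(f"name does not follow the convention: {name!r}")
    return name + ".ZPG"  # the manual says .ZPG is added to the filename


print(mask_filename("in", "PagesRojes"))  # IN-PagesRojes.ZPG
```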


3.1 Base Markers

ZoloMask uses some basic hierarchical markers to isolate data from a web page.

<Add_Start>
<Name_Start>
Jonas Doe
<Name_End>
<Street_Start>
Villavägen 27, 27525 Öreby
// Also contains determiners for CITY and ZIP CODE
<Street_End>
// Other data: phone, fax, tollfree, http and email
<Add_End>
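
The nesting of these markers can be pictured with a small Python sketch. It is not ZoloMask's actual implementation: the helper `between`, the sample HTML fragment and the marker strings are all invented for the example, and the sketch assumes that each marker is a literal piece of HTML.

```python
def between(text: str, start: str, end: str) -> list[str]:
    """Return every substring found between a start and an end marker."""
    found, pos = [], 0
    while True:
        i = text.find(start, pos)
        if i == -1:
            break
        i += len(start)
        j = text.find(end, i)
        if j == -1:
            break
        found.append(text[i:j])
        pos = j + len(end)
    return found


# Hypothetical page fragment and markers, chosen only for this example.
html = (
    '<tr><td class="n">Jonas Doe</td>'
    '<td class="s">Villavägen 27, 27525 Öreby</td></tr>'
)

for entry in between(html, "<tr>", "</tr>"):            # Add_Start / Add_End
    name = between(entry, '<td class="n">', "</td>")    # Name_Start / Name_End
    street = between(entry, '<td class="s">', "</td>")  # Street_Start / Street_End
    print(name[0], "|", street[0])
```

The outer pair isolates one entry first, and the inner pairs are then searched only inside that entry, which mirrors the hierarchy shown above.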

In the editing grid, these base markers can be defined as shown below.


3.2 Wild Cards

A number of wild cards are available to help you work with variable elements in the HTML
code.

1. §
--> replaces a maximum of 25 characters (from 1 up to 25), including line breaks. Useful, for example,
when table cells have various colors from one line to the next.

Example:

<table bgcolor="§" >


corresponds to both
<table bgcolor="000000" >
and
<table bgcolor="FFFFFF" >
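
For readers who know regular expressions, the behaviour of § described above can be approximated in Python as follows. This is a sketch under assumptions: the function `mask_to_regex` is hypothetical, and whether ZoloMask matches as few or as many characters as possible is not documented, so the shortest match is assumed here.

```python
import re


def mask_to_regex(mask: str) -> re.Pattern:
    """Translate a mask containing § into a compiled regular expression.

    Assumption: § matches 1 to 25 arbitrary characters (line breaks
    included) and stops as early as possible.
    """
    literal_parts = [re.escape(part) for part in mask.split("§")]
    return re.compile(r"[\s\S]{1,25}?".join(literal_parts))


pattern = mask_to_regex('<table bgcolor="§" >')
print(bool(pattern.search('<table bgcolor="000000" >')))  # True
print(bool(pattern.search('<table bgcolor="FFFFFF" >')))  # True
print(bool(pattern.search('<table bgcolor="" >')))        # False
```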

2. ~
--> goes back 30 characters, capturing the text that precedes the expression that follows it
rather than the 30 characters after it.

Examples:

fax: §
captures
fax: (215)1247847879

~(fax)
captures
(215)1247847879 (fax)

3. ? = any one character
--> indicates a hypothetical character or digit

Example:

British postcode

NW?11?N? 1N?N?


captures
E17 3HX
SW4 7AA
SW18 4DW
EC1A 9LH
etc.

4. | = OR
--> placed between two alternative expressions, makes ZoloMask look for either of them in the HTML code

Example:

<table width="365">|</tr></td>

corresponds to both

<table width="365">
and/or
</tr></td>
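
The OR marker behaves much like alternation in a regular expression. The Python sketch below illustrates the idea; the function `alternatives_to_regex` and the sample HTML are hypothetical, and the sketch assumes that "|" always acts as a separator and never appears as a literal character in the mask.

```python
import re


def alternatives_to_regex(mask: str) -> re.Pattern:
    """Compile a mask whose alternatives are separated by '|'."""
    # Escape each alternative so that HTML punctuation is matched literally.
    options = [re.escape(option) for option in mask.split("|")]
    return re.compile("|".join(options))


pattern = alternatives_to_regex('<table width="365">|</tr></td>')
html = '<table width="365"><tr><td>Jonas Doe</tr></td>'
print(pattern.findall(html))  # ['<table width="365">', '</tr></td>']
```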

5. °
--> replaces a maximum of 5 characters (from 1 up to 5), including line breaks. Useful, for example,
when table cells have various widths from one line to the next.

Example:

<table width="°">

corresponds to

<table width="250">,
<table width="50%">,
<table width="125">, etc.

These wild cards can of course be combined in individual expressions.
