Design of Voice Controlled Robot with Gripping Mechanism using AVR

Under the sincere supervision of:
Mr. Mukesh Sahu
Mrs. Mamta Rani

Submitted by
Anjani Kumar Singh (11413202809)
Kanwaljeet Singh (09213202809)
Lovneesh Tanwar (09913202809)
Nikhil Aggarwal (13213202809)

In partial fulfillment of the requirement for the award of the degree of
Bachelor of Technology
in
Electronics and Communication Engineering
Batch: 2009-2013

Guru Tegh Bahadur Institute of Technology
ANKL Group (ECE)
ACKNOWLEDGEMENT
We take this opportunity to express our deepest gratitude towards Mr. Mukesh Sahu and Ms. Mamta Rani, our project guides, who have been the driving force behind this project and whose guidance and co-operation have been a source of inspiration for us. We would also like to thank Mr. Amrish Maggo and Ms. Parul Dawar for their valuable support whenever needed. We are very much thankful to our professors, colleagues and the authors of the various publications to which we have been referring. We express our sincere appreciation and thanks to all those who have guided us directly or indirectly in our project. Much-needed moral support and encouragement was also provided on numerous occasions by our whole division. Finally, we thank our parents for their immense support.
Anjani Kumar
Nikhil Aggarwal
Kanwaljeet Singh
Lovneesh Tanwar
DECLARATION
This is to certify that the report entitled Design of Voice Controlled Robot with Gripping Mechanism using AVR, which is submitted by us in partial fulfillment of the requirement for the award of the degree of Bachelor of Technology in Electronics and Communication Engineering from Guru Tegh Bahadur Institute of Technology, New Delhi, comprises our original work carried out under the supervision of our project guides, and due acknowledgement has been made in the text to all other material used.
Anjani Kumar
Nikhil Aggarwal
Kanwaljeet Singh
Lovneesh Tanwar
CERTIFICATION
This is to certify that the Minor Project Report entitled Design of Voice Controlled Robot with Gripping Mechanism using AVR, submitted in partial fulfillment of the requirement for the degree of Bachelor of Technology in Electronics and Communication Engineering, is a bona fide research work carried out by Anjani Kumar, Nikhil Aggarwal, Kanwaljeet Singh and Lovneesh Tanwar, students of Guru Tegh Bahadur Institute of Technology, affiliated to Guru Gobind Singh Indraprastha University, Delhi, under our supervision. This work has not been submitted to any other University or Institution for the award of any degree/diploma/certificate.
ABSTRACT
This project presents the research and implementation of a voice-automated mobile robot. The robot is controlled through connected-speech input. Spoken-language input allows a user to interact with the robot in a manner familiar to most people. The advantages of speech-activated robots are hands-free and fast data-input operations. In the future, it is expected that speech recognition systems will be used as the man-machine interface for robots in rehabilitation, entertainment, etc. In view of this, the system described here serves as a learning platform for a mobile robot that takes speech input as commands and performs navigation tasks through a distinct man-machine interaction. The speech recognition system is trained to recognize a set of defined commands, and the robot navigates based on these spoken commands. The medium of interaction between the human and the machine is the processing of speech (words uttered by the person). The complete system consists of three subsystems: the speech recognition system, a central controller and the robot. We have studied factors such as noise, which interferes with speech recognition, and the effect of distance. The results show that the proposed robot is capable of understanding the meaning of speech commands.
CONTENTS
1. INTRODUCTION
   1.1 Overview
   1.2 Voice Recognition System
   1.3 Modules of Project
2. VOICE RECOGNITION MODULE
3. MICROCONTROLLER MODULE
   3.1 Pin Description
4. TRAINER MODULES
   4.1 Keypad Matrix
   4.2 Seven Segment Display Module
5. MOTOR DRIVER MODULE
6. OTHER MODULES
   6.1 Power Supply Module
8. SOFTWARE DEVELOPMENT
   8.1 Using AVR Studio 4
   8.2 Flowchart
10. APPLICATIONS
    Applications and Future Aspects
13. REFERENCES
LIST OF FIGURES
1.1 Overview
1.2 Speech Recognition Module
1.3 Keypad Matrix Module
1.4 Display Module
1.5 Motor Driver Module
1.6 Power Supply Module
1.7 Microcontroller Module
2.1 HM2007 Module
2.2 Schematic of Speech Recognition Module
2.3 Pin Description of HM2007 IC
2.4 Keypad
2.5 HY6264 IC
2.6 Working of HM2007
3.1 Microcontroller Module
4.1 Keypad Matrix Circuit
4.2 Seven Segment Display Module
4.3 Logic Diagram of 74LS373
5.1 Motor Driver Module
5.2 Motor Driver Circuit
6.1 Power Supply
6.2 Gripper
8.1 Create a New Project
8.2 Specifying Project Name and Path
8.3 Select Debug Platform & Microcontroller
8.4 Configure Project Settings
8.5 Flow Chart
CHAPTER 1: INTRODUCTION
1.1 OVERVIEW
The purpose of this project is to build a robotic car that can be controlled using voice commands. Generally, these kinds of systems are known as Speech Controlled Automation Systems (SCAS); our system is a prototype of the same. We are not aiming to build a robot that can recognize a large vocabulary. Our basic idea is to develop a menu-driven control for the robot, where the menu is voice driven. What we are aiming at is to control the robot using the following voice commands.
Figure 1.1: Overview
The robot should be able to perform these basic tasks:
1. Move forward
2. Move back
3. Turn right
4. Turn left
5. Load
6. Release
7. Stop
1. A microphone picks up the speech signal to be recognized and converts it into an electrical signal. A modern speech recognition system also requires that the electrical signal be represented digitally by means of an analog-to-digital (A/D) conversion process, so that it can be processed by a digital computer or a microprocessor.
2. This speech signal is then analyzed (in the analysis block) to produce a representation consisting of the salient features of the speech. The most prevalent feature of speech is derived from its short-time spectrum, measured successively over short-time windows of length 20-30 milliseconds overlapping at intervals of 10-20 ms. Each short-time spectrum is transformed into a feature vector, and the temporal sequence of such feature vectors forms a speech pattern (a small code sketch of this framing step follows this list).
3. The speech pattern is then compared to a store of phoneme patterns or models through a dynamic programming process in order to generate a hypothesis (or a number of hypotheses) of the phonemic unit sequence. (A phoneme is a basic unit of speech, and a phoneme model is a succinct representation of the signal that corresponds to a phoneme, usually embedded in an utterance.) A speech signal inherently has substantial variations along many dimensions.
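To make step 2 concrete, the sketch below splits a digitized speech buffer into overlapping analysis frames and computes a per-frame energy value. It is only an illustration of the windowing arithmetic, not part of the project firmware; the 8 kHz sample rate, the buffer length and the frame_energies name are assumptions, and a real recognizer would derive a full spectral feature vector per frame rather than a single energy.

#include <stdio.h>
#include <stddef.h>

/* Illustrative short-time analysis: 25 ms frames taken every 10 ms at an
   assumed 8 kHz sample rate (200-sample frames, 80-sample hop). */
#define SAMPLE_RATE 8000
#define FRAME_LEN   (SAMPLE_RATE * 25 / 1000)   /* 200 samples */
#define FRAME_HOP   (SAMPLE_RATE * 10 / 1000)   /*  80 samples */

/* Compute the energy of each overlapping frame of a speech buffer. */
size_t frame_energies(const short *speech, size_t n_samples,
                      double *energy, size_t max_frames)
{
    size_t n_frames = 0;
    for (size_t start = 0;
         start + FRAME_LEN <= n_samples && n_frames < max_frames;
         start += FRAME_HOP) {
        double e = 0.0;
        for (size_t i = 0; i < FRAME_LEN; i++) {
            double s = speech[start + i];
            e += s * s;               /* sum of squared samples in the window */
        }
        energy[n_frames++] = e;
    }
    return n_frames;
}

int main(void)
{
    short speech[1600] = {0};         /* 200 ms of dummy (silent) input */
    double energy[32];
    size_t n = frame_energies(speech, 1600, energy, 32);
    printf("frames analysed: %zu\n", n);   /* 18 frames for this buffer */
    return 0;
}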
Before we look at the design of the project, let us first understand speech recognition types and styles. Speech recognition is classified into two categories: speaker dependent and speaker independent.

Speaker dependent: This system is trained by the individual who will be using it. These systems are capable of achieving a high command count and better than 95% accuracy for word recognition. The drawback to this approach is that the system responds accurately only to the individual who trained it. This is the most common approach employed in software for personal computers.

Speaker independent: This is a system trained to respond to a word regardless of who speaks. The system must therefore respond to a large variety of speech patterns, inflections and enunciations of the target word. The command word count is usually lower than for speaker-dependent systems; however, high accuracy can still be maintained within processing limits. Industrial applications more often need speaker-independent voice systems, such as the AT&T system used in telephone networks. A more general form of voice recognition is available through feature analysis, and this technique usually leads to "speaker-independent" voice recognition. Instead of trying to find an exact or near-exact match between the actual voice input and a previously stored voice template, this
method first processes the voice input using Fourier transforms or linear predictive coding (LPC), then attempts to find characteristic similarities between the expected inputs and the actual digitized voice input. These similarities will be present for a wide range of speakers, so the system need not be trained by each new user. The types of speech differences that the speaker-independent method can deal with, but which pattern matching would fail to handle, include accents and varying speed of delivery, pitch, volume and inflection. Speaker-independent speech recognition has proven to be very difficult, with some of the greatest hurdles being the variety of accents and inflections used by speakers of different nationalities. Recognition accuracy for speaker-independent systems is somewhat less than for speaker-dependent systems, usually between 90 and 95 percent. Speaker-independent systems have the advantage of not requiring each user to train the system, but they perform with lower accuracy.

Recognition style:
Speech recognition systems have another constraint concerning the style of speech they can recognize. There are three styles of speech: isolated, connected and continuous.

Isolated speech recognition systems can only handle words that are spoken separately. This is the most common type of speech recognition system available today. The user must pause between each word or command spoken. The speech recognition circuit is set up to identify isolated words of 0.96 second length.

Connected speech recognition is a halfway point between isolated-word and continuous speech recognition. It allows users to speak multiple words. The HM2007 can be set up to identify words or phrases 1.92 seconds in length. This reduces the recognition vocabulary to 20 words.

Continuous speech is the natural conversational speech we are used to in everyday life. It is extremely difficult for a recognizer to sift through the speech as the words tend to merge together. For instance, "Hi, how are you doing?" sounds like "Hi,howyadoin". Continuous speech recognition systems are on the market and are under continual development.
1.3 MODULES OF PROJECT

1. VOICE RECOGNITION MODULE: This module basically consists of the HM2007L, SRAM (8K x 8), a MIC input and a keypad (4 x 3). It is the heart of the entire system. The HM2007 is a voice recognition chip with an on-chip analog front end, voice analysis, recognition processing and system control functions. The input voice command is analyzed, processed and recognized, and the result appears on the chip's output port; this result is then decoded by the microcontroller, amplified and used to drive the motors of the robot car. The SRAM is used to store the trained commands for later comparison. The keypad is used to configure the HM2007 by specifying the number under which a command (word) is to be stored in RAM.
2. DISPLAY MODULE: This module displays the numbers corresponding to the commands stored for the specific tasks to be performed by the microcontroller. It consists of two seven-segment displays. A latch (74LS373) provides the data to the seven-segment displays and to the microcontroller at the same time. The module also contains two BCD-to-seven-segment decoders (CD4511B).
3. MOTOR DRIVER MODULE: The basic purpose of this module is to drive the motors, i.e. to supply 12 V to them.
4. POWER SUPPLY MODULE: This module is the energy source of our project. Its purpose is to provide the 5 V and 12 V power supplies.
5. MICROCONTROLLER MODULE: This module is used to control the robot according to the logic stored in the form of code. We are using an ATmega16 as the microcontroller.
In the figure above, the HM2007 module is shown. The numbering of the components in the block is as follows:
1. 3 V BATTERY CELL
2. RAM
3. HM2007 VOICE RECOGNITION IC
4. CRYSTAL OSCILLATOR
5. INPUT PINS FOR ANALOG SIGNAL (HERE FROM MIC)
6. LED (TO SHOW THE CURRENT STATUS OF THE HM2007 IC)
7. BERG STRIPS FOR SENDING 8-BIT OUTPUT SIGNALS TO THE SEVEN-SEGMENT AND MICROCONTROLLER CIRCUIT BOARD
8. BERG STRIPS FOR ATTACHING THE KEYPAD MATRIX TO TRAIN THE HM2007
9. GND
In our project we have used the 52-pin PLCC package as shown in the diagram above. Pin numbering starts from the dot shown and proceeds in the anti-clockwise direction.
2.3 SPECIFICATION
INPUT VOLTAGE: 9-15 V DC
OUTPUT: 8 bits at 5 V logic levels
INTERFACE: Any microcontroller such as the 8051, PIC or AVR can be interfaced to the data port to interpret the output.
B) CPU mode: The CPU mode provides functions such as RECOG, TRAIN, RESULT, UPLOAD, DOWNLOAD and RESET. In this mode, the K-bus is used as a bidirectional data bus between the external controller and the HM2007, and S1 to S3 act as the R/W control pins. In our project we have used the manual mode for training and other purposes.
A) Power ON
When the power is switched ON, the HM2007 starts its initialization process. If the WAIT pin is LOW, the HM2007 performs a memory check to verify that the external 8K-byte SRAM is working properly. If the WAIT pin is HIGH, the HM2007 skips the memory check. After initialization is done, the HM2007 moves into recognition mode.
B) RECOGNITION MODE
1. WAIT pin HIGH: In this mode, RDY is set LOW and the HM2007 is ready to accept the voice input to be recognized. When a voice input is detected, RDY returns to HIGH and the HM2007 begins its recognition process. It is recommended that the user train the word patterns before beginning recognition; otherwise the result will be unpredictable. After the recognition process is complete, the result appears on the D-bus with DEN active.
2. WAIT pin LOW: In this mode, no voice input is accepted until the WAIT pin returns to the HIGH state.
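Purely to illustrate the handshake just described (in this project the result is actually captured by the 74LS373 latch rather than read directly by the microcontroller), an external controller could poll the HM2007 roughly as in the sketch below; the DEN polarity and the pin assignments (D-bus on PORT B, DEN on PC0) are assumptions, not the project wiring.

#include <avr/io.h>
#include <stdint.h>

/* Illustrative only: poll DEN and read the HM2007 result from its D-bus.
   Assumed (hypothetical) wiring: D-bus on PORT B, DEN on PC0; the DEN
   polarity used here is also an assumption and should be checked against
   the datasheet. */
uint8_t read_recognition_result(void)
{
    DDRB = 0x00;                      /* D-bus pins as inputs */
    DDRC &= ~(1 << PC0);              /* DEN pin as input     */

    while (!(PINC & (1 << PC0)))      /* wait for DEN to be asserted */
        ;
    return PINB;                      /* result code placed on the D-bus */
}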
When the number has been entered, a function key is pressed to choose the particular operation: TRAIN (#) or CLEAR (*).

If the function key CLR (*) is pressed, the corresponding word pattern is cleared and the HM2007 returns to its recognition mode. If the function key TRN (#) is pressed, the HM2007 sends a low-level signal on RDY to indicate that it is ready to accept voice input. If the WAIT pin is LOW, no voice input is detected until the WAIT pin returns to HIGH. After a valid voice input has been received, the HM2007 returns to its recognition mode and again sends a low-level signal on RDY to indicate that it is ready for voice input for the recognition process. For example, 24 TRN trains the 24th pattern, and 1326 TRN uses only the last two digits, i.e. trains the 26th pattern.
E) OUR TRAINED COMMANDS
i. 01 for NORTH
ii. 02 for SOUTH
iii. 03 for EAST
iv. 04 for WEST
v. 05 for WAIT
vi. vii.
F) ERROR CODES
The chip provides the following error codes:
55 - WORD TOO LONG
66 - WORD TOO SHORT
77 - NO MATCH FOUND
PIN NAMES
A0-A12: ADDRESS INPUT
I/O0-I/O7: DATA INPUT/OUTPUT
CS1: CHIP SELECT ONE
CS2: CHIP SELECT TWO
WE: WRITE ENABLE
OE: OUTPUT ENABLE
VCC: POWER SUPPLY
GND: GROUND
The SRAM uses a 3 V supply for its operation. The SRAM shares its address bus (A0-A12) with the HM2007's address bus, because during operation the HM2007 checks the RAM for the words stored corresponding to the input voice signal. Without the SRAM, the HM2007 cannot retain any commands after training. The SRAM has an 8-bit data bus (D0-D7), which is connected to the 8-bit output of the HM2007 and to the latch IC of the seven-segment module.
[Block diagram: the HM2007 sends an 8-bit signal to the seven-segment module, and 8-bit data lines connect it to the microcontroller module.]
The microcontroller gets its input from the 74LS373 latch IC, which holds the BCD output of the HM2007; this BCD output is generated when a match is found by the HM2007 during recognition or trainer mode. The microcontroller is programmed so that a specific action is performed for a specific word spoken into the HM2007 module; here we make the robot move forward, backward, left or right and close or open the gripper attached on top of the robot. PORT B is used to read the BCD code and PORT D is used as the output port, which is further interfaced with the motor driver unit. The code is written in embedded C using AVR Studio 4, and KHAZMA is used to burn the .hex file onto the microcontroller through an ISP connector. In-System Programming allows programming and reprogramming of any AVR microcontroller positioned inside the end system. Using a simple three-wire SPI interface, the In-System Programmer communicates serially with the AVR microcontroller, reprogramming all non-volatile memories on the chip. In-System Programming eliminates the physical removal of chips from the system. This saves time and money, both during development in the lab and when updating the software or parameters in the field.
VCC: Digital supply voltage.

GND: Ground.

Port A (PA7..PA0): Port A serves as the analog inputs to the A/D Converter. Port A also serves as an 8-bit bi-directional I/O port if the A/D Converter is not used. Port pins can provide internal pull-up resistors (selected for each bit). The Port A output buffers have symmetrical drive characteristics with both high sink and source capability. When pins PA0 to PA7 are used as inputs and are externally pulled low, they will source current if the internal pull-up resistors are activated. The Port A pins are tri-stated when a reset condition becomes active, even if the clock is not running.

Port B (PB7..PB0): Port B is an 8-bit bi-directional I/O port with internal pull-up resistors (selected for each bit). The Port B output buffers have symmetrical drive characteristics with both high sink and source capability. As inputs, Port B pins that are externally pulled low will source current if the pull-up resistors are activated. The Port B pins are tri-stated when a reset condition becomes active, even if the clock is not running.

Port C (PC7..PC0): Port C is an 8-bit bi-directional I/O port with internal pull-up resistors (selected for each bit). The Port C output buffers have symmetrical drive characteristics with both high sink and source capability. As inputs, Port C pins that are externally pulled low will source current if the pull-up resistors are activated. The Port C pins are tri-stated when a reset condition becomes active, even if the clock is not running. If the JTAG interface is enabled, the pull-up resistors on pins PC5 (TDI), PC3 (TMS) and PC2 (TCK) will be activated even if a reset occurs.

Port D (PD7..PD0): Port D is an 8-bit bi-directional I/O port with internal pull-up resistors (selected for each bit). The Port D output buffers have symmetrical drive characteristics with both high sink and source capability. As inputs, Port D pins that are externally pulled low will source current if the pull-up resistors are activated. The Port D pins are tri-stated when a reset condition becomes active, even if the clock is not running.

RESET: Reset input. A low level on this pin for longer than the minimum pulse length will generate a reset, even if the clock is not running. Shorter pulses are not guaranteed to generate a reset.

XTAL1: Input to the inverting oscillator amplifier and input to the internal clock operating circuit.

XTAL2: Output from the inverting oscillator amplifier.
AVCC is the supply voltage pin for Port A and the A/D Converter. It should be externally connected to VCC, even if the ADC is not used. If the ADC is used, it should be connected to VCC through a low-pass filter. AREF is the analog reference pin for the A/D Converter.
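For reference, the direction and pull-up behaviour described above is set entirely through the DDRx and PORTx registers. The fragment below is a generic AVR-GCC sketch, not taken from the project code, and the pin choices (PB0 as an input with pull-up, PD0 as an output) are arbitrary.

#include <avr/io.h>

void pin_setup_example(void)
{
    DDRB  &= ~(1 << PB0);    /* PB0 as input (DDR bit = 0)                    */
    PORTB |=  (1 << PB0);    /* writing 1 to an input pin enables its pull-up */

    DDRD  |=  (1 << PD0);    /* PD0 as output (DDR bit = 1)                   */
    PORTD &= ~(1 << PD0);    /* drive PD0 low initially                       */
}

int main(void)
{
    pin_setup_example();
    for (;;) {
        if (PINB & (1 << PB0))       /* PINx reads the actual pin levels */
            PORTD |= (1 << PD0);     /* mirror PB0 onto PD0              */
        else
            PORTD &= ~(1 << PD0);
    }
}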
column 2 will be low, so we come to know that key 2 of row 1 is pressed. This is how the HM2007 scans the keyboard matrix during training and recognition.
So, to scan the keypad completely, we need to pull the rows low one by one and read the columns. If any button in a row is pressed, it pulls the corresponding column low, which tells us that a key is pressed in that row. If button 1 of the row is pressed, column 1 goes low; if button 2 is pressed, column 2 goes low, and so on.
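The sketch below shows this row-scan technique written for an AVR. It is only an illustration: in this project the scan is performed internally by the HM2007, and the assumed wiring (rows on PB0-PB3, columns on PB4-PB6 with internal pull-ups) is hypothetical.

#include <avr/io.h>
#include <stdint.h>

/* Row-scan of a 4x3 keypad. Assumed wiring: rows on PB0-PB3 (outputs),
   columns on PB4-PB6 (inputs with internal pull-ups).
   Returns 1-12 for a pressed key, 0 if no key is pressed. */
uint8_t scan_keypad(void)
{
    DDRB  = 0x0F;                                /* rows as outputs, columns as inputs */
    PORTB = 0x7F;                                /* rows high, pull-ups on columns     */

    for (uint8_t row = 0; row < 4; row++) {
        PORTB = (uint8_t)(0x7F & ~(1 << row));   /* pull one row low            */
        __asm__ __volatile__("nop");             /* let the pin levels settle   */
        uint8_t cols = PINB;
        for (uint8_t col = 0; col < 3; col++) {
            if (!(cols & (1 << (4 + col))))      /* a low column means key down */
                return (uint8_t)(row * 3 + col + 1);
        }
    }
    return 0;
}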
The keypad and digital display are used to communicate with and program the HM2007 chip. The keypad is made up of 12 normally-open momentary contact switches. When the circuit is turned on, 00 is shown on the digital display, the red LED (READY) is lit and the circuit waits for a command.

Training words for recognition: press 1 on the keypad (the display will show 01 and the LED will turn off), then press the TRAIN key (the LED will turn on) to place the circuit in training mode for word one. Say the target word clearly into the on-board microphone (near the LED). The circuit signals acceptance of the voice input by blinking the LED off and then on. The word (or utterance) is now identified as word 01. If the LED did not flash, start over by pressing 1 and then the TRAIN key. You may continue training new words in the circuit: press 2 and then TRN to train the second word, and so on. The circuit will accept and recognize up to 20 words (numbers 1 through 20). It is not necessary to train all word spaces; if you only require 10 target words, that is all you need to train.
In recognition mode, the chip compares the analog signal fed from the microphone with the patterns stored in the SRAM, and if it recognizes a command, the command identifier is sent to the microcontroller through the D0 to D7 ports of the chip as a BCD output. For training, testing (checking whether a word is recognized properly) and clearing the memory, the keypad and 7-segment display are used.
The eight latches of the 74LS373 are transparent D-type latches, meaning that while the enable (G) is high the Q outputs follow the data (D) inputs. When the enable is taken low, the output is latched at the level of the data that was set up. The BCD output of the HM2007 is latched by this IC; the latched value is read by the ATmega16 microcontroller, which performs the various functions according to the recognized voice command, and is also converted by the CD4511B to drive the 7-segment display.
The CD4511B is a BCD-to-7-segment latch/decoder/driver constructed with CMOS logic and n-p-n bipolar transistor output devices on a single monolithic structure. It is used to drive the 7-segment display according to the HM2007 output during training and recognition. The following errors are displayed on the seven-segment displays during training and recognition:
55 = word too long
66 = word too short
77 = no match
When the motor is connected to the power supply, the shaft rotates. You can reverse the direction of rotation by reversing the polarity of the input.
This chip is designed to control two DC motors. There are two INPUT and two OUTPUT pins for each motor. The connections are as follows.
The behavior of the motor for various input conditions is as follows:

A      B      Behavior
LOW    LOW    Stop
LOW    HIGH   Clockwise
HIGH   LOW    Anti-clockwise
HIGH   HIGH   Stop
We just need to set appropriate levels at two pins of the microcontroller to control the motor. Since this chip controls two DC motors, there are two more output pins (OUTPUT3 and OUTPUT4) and two more input pins (INPUT3 and INPUT4). INPUT3 and INPUT4 control the second motor in the same way as listed above for inputs A and B. There are also two ENABLE pins; they must be high (+5 V) for operation, and if they are pulled low (GND) the motors will stop.
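A minimal code sketch of this control scheme is shown below. It assumes, hypothetically, that IN1 and IN2 of one L293D channel are wired to PD0 and PD1 and that the corresponding ENABLE pin is tied to +5 V; the level pairs follow the table above.

#include <avr/io.h>

#define IN1 PD0   /* assumed wiring: channel input 1 -> PD0 */
#define IN2 PD1   /* assumed wiring: channel input 2 -> PD1 */

void motor_init(void)          { DDRD |= (1 << IN1) | (1 << IN2); }
void motor_stop(void)          { PORTD &= ~((1 << IN1) | (1 << IN2)); }        /* LOW,  LOW  */
void motor_clockwise(void)     { PORTD &= ~(1 << IN1); PORTD |= (1 << IN2); }  /* LOW,  HIGH */
void motor_anticlockwise(void) { PORTD |= (1 << IN1); PORTD &= ~(1 << IN2); }  /* HIGH, LOW  */

Calling motor_init() once and then one of the three movement functions sets the two input levels exactly as in the truth table; the second channel (INPUT3/INPUT4) would be handled the same way on two further pins.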
The function of this block is to distribute the power supply to all the other modules for their proper working. We have two major power supply circuits:
1. A 12 V power supply circuit for the motor driver module, to drive the motors.
2. A 5 V power supply circuit for modules such as the HM2007 module, microcontroller module, motor driver module and seven-segment module.
6.2 FUNCTIONING
As we want a final output of at most 12 V, we need a DC source with a minimum voltage rating of 13 V. For the 5 V power supply we need a minimum input voltage of 6 V. As seen in the block diagram above (going from left to right), the second component after the input voltage (point 1) is a capacitor, which is used to smooth the DC signal and block any AC component present. The third component is the voltage regulator IC: a 7805 in the case of the 5 V power supply circuit and a 7812 in the case of the 12 V power supply circuit. More about these ICs can be found in the datasheets given in the appendix. The 7805 and 7812 drop the input voltage to 5 V and 12 V respectively. Components 4 and 5 are used simply to indicate the state of the power supply circuit, whether it is
ON or OFF. Going further, components 6 and 7 are used to further smooth the final output voltage so that we get as clean a DC voltage as possible.

IMPORTANT NOTE: Since we have built all the modules on different PCBs, each module having its own independent circuitry, we have to connect the ground pins of all the modules to the same point, i.e. to each other. This is called a COMMON GROUND condition, where all the ground pins are shorted together for proper operation of the circuit. So the ground/negative pin of the power supply circuit must be connected through to all the other modules.
Under power-on mode the gripper shall continue to grip the object; on power-off it shall try to release the load. The recommended voltage for operating the gripper motors is 6 V.
Figure 8.1: Create a New Project
Figure 8.2: Specifying Project Name and Path
Figure 8.3: Select Debug Platform & Microcontroller
Figure 8.4: Configure Project Settings
Having understood this, let's go ahead and write a subroutine which will initialize the I/O port directions.

void port_init(void)
{
    DDRA = 0xFF;  PORTA = 0x00;   /* Port A: all pins output, driven low            */
    DDRB = 0xF7;  PORTB = 0x7F;   /* Port B: PB3 as input with pull-up, rest output */
    DDRC = 0xFF;  PORTC = 0x00;   /* Port C: all pins output, driven low            */
    DDRD = 0x00;  PORTD = 0xFF;   /* Port D: all pins input, pull-ups enabled       */
    /* Ports E, F and G do not exist on the ATmega16; the following settings apply
       only to larger AVR devices:
       DDRE = 0x7C; PORTE = 0xFF;
       DDRF = 0x00; PORTF = 0x00;
       DDRG = 0x00; PORTG = 0x1F; */
}
8.2 FLOWCHART
Figure 8.5: Flow Chart
PHASE 5: SOFTWARE DEVELOPMENT PHASE This was a completely different phase from all the previous ones. The AVR platform was new to us, and we had to learn everything from scratch to understand the basic concepts before starting to program. The main problem we encountered in programming was with the clock cycle of the microcontroller. Other problems were related to the delay function and header files. PHASE 6: SOFTWARE TESTING AND DEBUGGING PHASE In this phase we had to burn the code onto the microcontroller. The first problem here was finding a suitable burner kit to burn the software onto the ATmega16 IC. Testing included checking the connections against the programming we had done. The main problem we came across was having to interchange the connections to match our program.
The robot is useful in places that are difficult for humans to reach but that the human voice can reach, e.g. inside a small pipeline, in fire situations or in highly toxic areas. The robot can also be used as a toy, and it can be used to fetch and place small objects. It is one of the important stages on the way to humanoid robots. Other applications include:
Command and control of appliances and equipment
Telephone assistance systems
Data entry
Speech and voice recognition security systems
CONCLUSION
The voice recognition system has an accuracy of around 70% to 80% in correctly identifying a voice command, but it is highly sensitive to surrounding noise. There is a possibility of misinterpreting some noises as one of the voice commands given to the robot, and the accuracy of word recognition also drops in the presence of noise. The sound coming from the motors has a significant effect on accuracy. There are some drawbacks in the mobile platform. The rechargeable 6 V batteries carried on board make it heavy, so we had to use powerful motors to drive the robot, making the power consumption higher and forcing us to recharge the batteries quite often. The mobile platform had some problems in turning due to its weight. The gripper also had problems gripping due to low power, so a more powerful drive is needed for it. The rear freewheels used to get stuck when turning, especially in reverse motion. Hence we suggest that a steering mechanism would be a better option.
APPENDICES
APPENDIX II SCHEMATICS
2. MICROCONTROLLER MODULE
3. DISPLAY MODULES
4. KEYPAD MODULE
6. POWER SUPPLY
APPENDIX III
#ifndef F_CPU
#define F_CPU 1000000UL          /* CPU clock assumed for _delay_ms(); set to the actual clock */
#endif
#include <avr/io.h>
#include <util/delay.h>

/* forward declarations of the movement helpers used in main() */
void NORTH(void);
void SOUTH(void);
void EAST(void);
void WEST(void);
void WAIT(void);

void delay_ms(int d)
{
    _delay_ms(d);
}

int main(void)
{
    unsigned char i;

    DDRB = 0x00;                 /* PORT B: input, reads the latched code from the 74LS373 */
    DDRC = 0xFF;                 /* PORT C: output                                          */
    DDRD = 0xFF;                 /* PORT D: output, drives the motor driver                 */

    while (1)
    {
        i = PINB;                /* code of the recognized word */
        if (i == 16)      { NORTH(); WAIT(); }
        else if (i == 32) { SOUTH(); WAIT(); }
        else if (i == 48) { EAST();  WAIT(); }
        else if (i == 64) { WEST();  WAIT(); }
        else if (i == 80) { WAIT(); }
    }
}
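The bodies of the movement helpers called in main() are not present in this copy of the listing. The fragment below, meant to sit in the same source file, is one plausible reconstruction; the PORTD bit patterns assume bits 0-1 drive the left L293D channel and bits 2-3 the right channel, which is an assumption rather than the original wiring.

/* Plausible sketch of the movement helpers (not the original definitions).
   Assumed wiring: PORTD bits 0-1 = left motor IN1/IN2, bits 2-3 = right motor IN1/IN2. */
void NORTH(void) { PORTD = 0x0A; }                 /* both motors forward           */
void SOUTH(void) { PORTD = 0x05; }                 /* both motors reverse           */
void EAST(void)  { PORTD = 0x09; }                 /* motors in opposite directions */
void WEST(void)  { PORTD = 0x06; }                 /* motors in opposite directions */
void WAIT(void)  { PORTD = 0x00; delay_ms(500); }  /* stop briefly                  */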
APPENDIX V DATASHEETS
1. HM2007
2. ATMEGA 16 L
3. HY6264A
4. 74LS373
5. CD4511B
6. L293D
REFERENCES
[1] Raj Kamal, Embedded System: Architecture, Programming and Design, Tata McGraw-Hill Education, 01-Jul-2003.
[2] Thomas, T., Voice processing: why speech recognizers make mistakes, Systems International (UK), 15, 10 (1987).
[3] Van Peursem, R. S., Speech recognition for quality systems, Quality, 26, 11 (1987), 48-49.
[4] Fiore, A., Pros and cons of voice systems, Computerworld, 22, 20 (1988), 79.
[5] Geoff Bristow, Electronic Speech Recognition: Techniques, Technology, and Applications, McGraw-Hill, Inc., New York, NY, 1986.
[6] George Philip, Elizabeth S. Young, Man-machine interaction by voice: developments in speech technology. Part 2: general applications and potential applications in libraries and information services, Journal of Information Science, v.13 n.1, p.15-23, Feb. 1987.
[7] Grant, P. M., Speech recognition techniques, IEEE Electronics & Communication Engineering Journal, Feb 1991.
[8] T. Schalk, P. J. Foster, Speech Recognition: The Complete Practical Reference Guide, Telecom Library Inc., New York, ISBN 0-9366648-39-2.
[9] Richard L. Klevans, Robert D. Rodman, Voice Recognition, Artech House, 1997.
[10] HM2007 (2006), 'HM2007 Voice Recognition IC', ITP Sensor Workshop, Vol. 2008.
[11] Steven Frank Barrett, Daniel J. Pack, Atmel AVR Microcontroller Primer: Programming and Interfacing, Morgan & Claypool Publishers, 2008.
[12] Muhammad Ali Mazidi, Janice Mazidi, Sarmad Naimi, Sepehr Naimi, AVR Microcontroller and Embedded Systems: Using Assembly and C, Prentice Hall, 2010.
[13] Richard H. Barnett, Larry D. O'Cull, Sarah Alison Cox, Embedded C Programming and the Atmel AVR, Cengage Learning, 2007.
[14] Muhammad Rashid, Power Electronics Handbook: Devices, Circuits, and Applications, Elsevier, 09-Dec-2010.
[15] Robert Daniel Twomey, Seven Segment Display, University of California, San Diego, 2007.
[16] "Robot," Merriam-Webster Online Dictionary, 2010 [Online]. Available: http://www.merriam-webster.com/dictionary/robot
[17] "Robot," American Heritage Dictionary, 2010 [Online]. Available: http://education.yahoo.com/reference/dictionary/entry/robot
[18] Bright Hub, 2010 [Online]. Available: http://www.brighthub.com/engineering/mechanical/articles/71937.aspx
[19] Joris de Ruiter, Natural Language Interaction - the understanding computer, 2010 [Online]. Available: www.few.vu.nl/~jdruiter/published_work/Natural_language_interaction.pdf
[20] R. P. Jain, Modern Digital Electronics, 3rd edition, Tata McGraw-Hill, Chapters 6 & 10 (for A/D converter and 7-segment display connections).
[21] Sunpreet Kaur Nanda, Akshay P. Dhande, Microcontroller Implementation of a Voice Command Recognition System for Human Machine Interface in Embedded System, International Journal of Electronics, Communication & Soft Computing Science and Engineering (IJECSCSE), Volume 1, Issue 1.
[22] H. Dudley, The Vocoder, Bell Labs Record, Vol. 17, pp. 122-126, 1939.
[23] H. Dudley, R. R. Riesz, and S. A. Watkins, A Synthetic Speaker, J. Franklin Institute, Vol. 227, pp. 739-764, 1939.
[24] J. G. Wilpon and D. B. Roe, AT&T Telephone Network Applications of Speech Recognition, Proc. COST232 Workshop, Rome, Italy, Nov. 1992.
[25] C. G. Kratzenstein, Sur la naissance de la formation des voyelles, J. Phys., Vol. 21, pp. 358-380, 1782.
[26] H. Dudley and T. H. Tarnoczy, The Speaking Machine of Wolfgang von Kempelen, J. Acoust. Soc. Am., Vol. 22, pp. 151-166, 1950.
[27] J. L. Flanagan, Speech Analysis, Synthesis and Perception, Second Edition, Springer-Verlag, 1972.
[28] H. Fletcher, The Nature of Speech and its Interpretations, Bell Syst. Tech. J., Vol. 1, pp. 129-144, July 1922.
[29] K. H. Davis, R. Biddulph, and S. Balashek, Automatic Recognition of Spoken Digits, J. Acoust. Soc. Am., Vol. 24, No. 6, pp. 627-642, 1952.
[30] H. F. Olson and H. Belar, Phonetic Typewriter, J. Acoust. Soc. Am., Vol. 28, No. 6, pp. 1072-1081, 1956.
[31] J. W. Forgie and C. D. Forgie, Results Obtained from a Vowel Recognition Computer Program, J. Acoust. Soc. Am., Vol. 31, No. 11, pp. 1480-1489, 1959.
[32] J. Sakai and S. Doshita, The Phonetic Typewriter, Information Processing 1962, Proc. IFIP Congress, Munich, 1962.
[33] K. Nagata, Y. Kato, and S. Chiba, Spoken Digit Recognizer for Japanese Language, NEC Res. Develop., No. 6, 1963.
[34] D. B. Fry and P. Denes, The Design and Operation of the Mechanical Speech Recognizer at University College London, J. British Inst. Radio Engr., Vol. 19, No. 4, pp. 211-229, 1959.
[35] T. B. Martin, A. L. Nelson, and H. J. Zadell, Speech Recognition by Feature Abstraction Techniques, Tech. Report AL-TDR-64-176, Air Force Avionics Lab, 1964.
[36] T. K. Vintsyuk, Speech Discrimination by Dynamic Programming, Kibernetika, Vol. 4, No. 2, pp. 81-88, Jan.-Feb. 1968.
[37] H. Sakoe and S. Chiba, Dynamic Programming Algorithm Quantization for Spoken Word Recognition, IEEE Trans. Acoustics, Speech and Signal Proc., Vol. ASSP-26, No. 1, pp. 43-49, Feb. 1978.
[38] B. S. Atal and S. L. Hanauer, Speech Analysis and Synthesis by Linear Prediction of the Speech Wave, J. Acoust. Soc. Am., Vol. 50, No. 2, pp. 637-655, Aug. 1971.
[39] F. Itakura and S. Saito, A Statistical Method for Estimation of Speech Spectral Density and Formant Frequencies, Electronics and Communications in Japan, Vol. 53A, pp. 36-43, 1970.
[40] F. Itakura, Minimum Prediction Residual Principle Applied to Speech Recognition, IEEE Trans. Acoustics, Speech and Signal Proc., Vol. ASSP-23, pp. 57-72, Feb. 1975.
[41] L. R. Rabiner, S. E. Levinson, A. E. Rosenberg and J. G. Wilpon, Speaker Independent Recognition of Isolated Words Using Clustering Techniques, IEEE Trans. Acoustics, Speech and Signal Proc., Vol. ASSP-27, pp. 336-349, Aug. 1979.
[42] B. Lowerre, The HARPY Speech Understanding System, in Trends in Speech Recognition, W. Lea, Editor, Speech Science Publications, 1986; reprinted in Readings in Speech Recognition, A. Waibel and K. F. Lee, Editors, pp. 576-586, Morgan Kaufmann Publishers, 1990.
[43] M. Mohri, Finite-State Transducers in Language and Speech Processing, Computational Linguistics, Vol. 23, No. 2, pp. 269-312, 1997.
[44] Dennis H. Klatt, Review of the DARPA Speech Understanding Project (1), J. Acoust. Soc. Am., 62, 1345-1366, 1977.
[45] F. Jelinek, L. R. Bahl, and R. L. Mercer, Design of a Linguistic Statistical Decoder for the Recognition of Continuous Speech, IEEE Trans. on Information Theory, Vol. IT-21, pp. 250-256, 1975.
[46] C. Shannon, A Mathematical Theory of Communication, Bell System Technical Journal, Vol. 27, pp. 379-423 and 623-656, July and October 1948.
[47] S. K. Das and M. A. Picheny, Issues in practical large vocabulary isolated word recognition: The IBM Tangora system, in Automatic Speech and Speaker Recognition - Advanced Topics, C. H. Lee, F. K. Soong, and K. K. Paliwal, editors, pp. 457-479, Kluwer, Boston, 1996.
[48] B. H. Juang, S. E. Levinson and M. M. Sondhi, Maximum Likelihood Estimation for Multivariate Mixture Observations of Markov Chains, IEEE Trans. Information Theory, Vol. IT-32, No. 2, pp. 307-309, March 1986.
[49] Dr. S. K. Saxena, Rachna Jain, Delhi Technical University, Voice Automated Mobile Robot, International Journal of Computer Applications, Volume 16, No. 2, February 2011.