Finding Rhymes in Hip-Hop Lyrics (and Other Text)

Overview

This project was developed by Bradley Buda at the University of Michigan for the EECS 595 Natural Language Processing class. A detailed paper describing the system and sample code are available below.

Abstract

This paper presents a system for identifying rhyming words in a block of text. The system is designed for finding rhymes in hip-hop (rap) music lyrics, but is general enough to work for any block of text containing rhymes. The system runs on Windows or Linux using the .NET Framework. It relies on the CMU Pronouncing Dictionary to find the constituent sounds for the supplied words, then uses a series of custom algorithms to find patterns that indicate rhymes. The system is capable of finding rhymes that span multiple words. Experimental results indicate that the system finds nearly all the intentional rhymes in a lyric, but fails in that it also finds a large number of unintentional rhymes. This paper describes the design of the system, describes the algorithms used in the system, gives experimental results, and presents an analysis of the system's successes and shortcomings and suggestions for future work.
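As a rough illustration of the dictionary-lookup step described above, here is a minimal Python sketch. It is not the project's C# code and does not reproduce the custom algorithms from the paper. It assumes the simplest textbook notion of a rhyme (two words rhyme when their sounds match from the last primary-stressed vowel onward) and handles single words only. The function names and the five-entry dictionary excerpt are made up for the example; the entries follow the CMU Pronouncing Dictionary's format.

```python
# Tiny excerpt in CMU Pronouncing Dictionary format: a word, then its phonemes.
# Vowel phonemes end in a stress digit (0 = none, 1 = primary, 2 = secondary).
SAMPLE_DICT = """\
CAT  K AE1 T
HAT  HH AE1 T
DOG  D AO1 G
NATION  N EY1 SH AH0 N
STATION  S T EY1 SH AH0 N
"""

def parse_dict(text):
    """Map each word to its list of phonemes (hypothetical helper)."""
    pronunciations = {}
    for line in text.splitlines():
        # Skip blank lines and the ";;;" comment lines used in the dictionary file.
        if not line.strip() or line.startswith(";;;"):
            continue
        word, *phonemes = line.split()
        pronunciations[word] = phonemes
    return pronunciations

def rhyme_part(phonemes):
    """Return the phonemes from the last primary-stressed vowel to the end."""
    for i in range(len(phonemes) - 1, -1, -1):
        if phonemes[i].endswith("1"):
            return phonemes[i:]
    return phonemes  # no primary stress marked: fall back to the whole word

def is_perfect_rhyme(a, b, pronunciations):
    """Assumed definition: different words whose rhyme parts are identical."""
    pa = pronunciations.get(a.upper())
    pb = pronunciations.get(b.upper())
    if pa is None or pb is None or pa == pb:
        return False  # unknown word, or the same pronunciation twice
    return rhyme_part(pa) == rhyme_part(pb)

PRON = parse_dict(SAMPLE_DICT)
print(is_perfect_rhyme("cat", "hat", PRON))         # True
print(is_perfect_rhyme("cat", "dog", PRON))         # False
print(is_perfect_rhyme("nation", "station", PRON))  # True
```

A strict test like this misses the looser and multi-word rhymes that are common in rap lyrics, which is why the system described in the paper relies on its own pattern-finding algorithms instead.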

Download the entire paper (969 KB PDF File)

Source Code

Download the source code (1.4 MB Zip File)

The source code for this project, written in C#, is freely available (see Licensing below).

Building the Source - Windows

Using Visual Studio .NET 2003

The easiest way to use this code is on Windows with Visual Studio .NET 2003. Just unzip the code file and load the 595-project.sln solution file. Build and execute the RhymeFinderTest project to search for rhymes in a text file, or the RhymeTester project to see if two words rhyme.

Using the Command Line

If you don't have Visual Studio, you can still build the code with the free Microsoft .NET Framework SDK. Download and install the SDK, then open the SDK command prompt. To build the project, change to the folder where you unzipped the source code and type:

csc /out:RhymeFinder.exe Base\*.cs RhymeFinderTest\*.cs

or, to build the RhymeTester project:

csc /out:Rhyme.exe Base\*.cs RhymeTester\*.cs

You may see some warnings during the build process; these can be ignored.

Building the Source - Linux

To use this project on Linux, you will need Mono, an open-source implementation of the .NET Framework. Mono 1.0.5 (the current version as of this writing) can be downloaded from the Mono project website (there is also a Windows version if you would rather not use the Microsoft implementation). Once Mono is installed, you can build the project with the command:

mcs Base/*.cs RhymeFinderTest/*.cs -o RhymeFinder.exe

and execute it with:

mono RhymeFinder.exe

You can also build and execute the RhymeTester project in the same way; see the Windows instructions above for the source folders to compile.

Using the Program

The project is at a very early stage. It may crash, and the algorithms are far from bulletproof. The program accepts the name of a text file, which must consist of space-separated words with no punctuation. The source code comes with five sample files.
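Because the input must be space-separated words with no punctuation, lyrics copied from elsewhere usually need cleaning first. The following Python sketch shows one way to do that; it is not part of the project. The function name and the keep_apostrophes option are made up for illustration. Apostrophes are kept by default on the assumption that contractions such as "don't" (which the CMU Pronouncing Dictionary lists) are acceptable; the program's handling of them is not documented here, so turn the option off if they cause trouble.

```python
import re

def normalize_lyrics(raw, keep_apostrophes=True):
    """Turn raw lyrics into space-separated words with punctuation removed.

    Hypothetical helper: every character that is not a letter, a digit,
    whitespace, or (optionally) an apostrophe is replaced by a space.
    Line breaks are then collapsed into single spaces.
    """
    allowed = "A-Za-z0-9'" if keep_apostrophes else "A-Za-z0-9"
    cleaned = re.sub(f"[^{allowed}\\s]", " ", raw)
    return " ".join(cleaned.split())

print(normalize_lyrics("Hello, world!\nIt's a test..."))
# Hello world It's a test
```

To produce an input file, read the raw lyrics, pass them through the function, and write the result to a new text file.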

Licensing

The paper describing this work is available under the Creative Commons Share-Alike license. The code is available under the GPL. If you have any questions about licensing or appropriate use of these items, please contact the author (see below).

Acknowledgements

The author would like to thank the people at the CMU Pronouncing Dictionary for their invaluable tool. The author would also like to thank Nick Shawver for his consultations on the project.

Contacting the Author

This is an experimental student project and I am unable to provide detailed support for the source code; if you wish to modify it, or even if you're having trouble, you're pretty much on your own. However, I can answer simple questions about the code - I can be contacted at bradleybuda {at} gmail {dot} com.


Copyright Bradley Buda, 2004