Programming expertise test (Python)

This test is for applicants who are applying for a position requiring Python programming skills. Please read the instructions very carefully.


Introduction

As part of our standard recruitment procedures, we ask programming candidates to undertake a small programming exercise. This should be written in Python. The exercise consists of:

  • Part 1: writing a program to solve a problem
  • Part 2: writing an analysis of other solutions to the problem.

Please design and code this program alone. While you will have colleagues in your normal working environment, we need to see how good you are, not how good your current friends or colleagues are. If we hire you and your level of competence is not as indicated by this test, we will have to carefully consider whether your employment should continue past the probation period. If you do copy code from elsewhere, for any reason (e.g. a library routine), please clearly indicate its source.

You should be able to finish this exercise in 4 hours. Some of the best people can complete it in much less time, but the time that you take will ultimately depend on your skill and the amount of care you take.

All code, comments and documentation should be in English (ideally using British spelling conventions).

Submission

You should submit your complete source code and the analysis document (please note the names required for the files) in a zip file with the name AblingPythonTest_<your name>.zip - for example, AblingPythonTest_ChiMo.zip. You should email it to recruit@abling.com with the subject line "Python test results from <your name>".

Part 1: Programming Problem

Write a program called wordcount.py that reads a file that contains ASCII encoded text and counts the occurrences of each word. Please code the program in a single file. Please use standard Python libraries only. Make sure your code is compatible with Python 2.4.

The program will not require any user intervention to operate. It will take the first argument from the command line as the full path to the file containing the words. The second argument will be the full path to the file in which to write the results. Always overwrite the results file without asking. For example, if the command:

c:\>wordcount.py words.txt c:\test_results\countxyz.txt

is entered in a command box (also known as a cmd or DOS box), then the program wordcount.py will execute (assuming it has been put into the folder c:\), read the words from the words.txt file in the local folder and write the results into the file countxyz.txt in the folder c:\test_results.
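
As an illustration only, the argument handling described above might be sketched as follows. The function name parse_args and the behaviour when arguments are missing (raising ValueError) are assumptions, since the instructions do not say what should happen in that case. The sketch was written against a current Python interpreter; it uses only plain constructs, but compatibility with Python 2.4 has not been checked here.

```python
import sys


def parse_args(argv):
    """Return (input_path, output_path) taken from a sys.argv-style list.

    argv[0] is the script name, argv[1] the file containing the words and
    argv[2] the file to write the results to.
    """
    # Assumption: missing arguments are treated as an error.
    if len(argv) < 3:
        raise ValueError("usage: wordcount.py <input file> <output file>")
    return argv[1], argv[2]


if __name__ == "__main__" and len(sys.argv) >= 3:
    input_path, output_path = parse_args(sys.argv)
```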

A word is a sequence of characters in the range [a-zA-Z]. Any other character is treated as a word separator.

A word may appear with mixed upper or lower case characters in the text file. Upper case characters should be converted to lower case before the word is counted.
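
To make the word definition concrete, here is a minimal sketch of one possible reading of it, using a regular expression. The names WORD_PATTERN and count_words are hypothetical and this is not a reference solution. Because every character outside [a-zA-Z] separates words, line feeds and carriage returns need no special handling in this approach.

```python
import re

# A word is a run of one or more ASCII letters; everything else separates words.
WORD_PATTERN = re.compile(r"[a-zA-Z]+")


def count_words(text):
    """Return a dictionary mapping each lower-cased word to its count."""
    counts = {}
    for word in WORD_PATTERN.findall(text):
        word = word.lower()
        counts[word] = counts.get(word, 0) + 1
    return counts
```

For instance, count_words("Hot2hat") finds the two words "hot" and "hat", because the digit acts as a separator.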

Lines in the input file may be terminated in either the *nix style (line feed) or the DOS style (carriage-return, line-feed), so your program should deal sensibly with either type.

The result file is to have the words listed in alphabetical order, with one word per line, followed by ": " (colon and a space), the word count, and a line separator. The line separator should be carriage-return, line-feed (i.e. the ASCII characters 0x0d 0x0a). So the word file:

Hot2hat
not/hot
nat hat-hot

would give the results:

hat: 2
hot: 3
nat: 1
not: 1
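
The output layout shown above could be produced along these lines. This is a sketch only; format_results is a hypothetical name, and it assumes the counts are already held in a dictionary of lower-case words.

```python
def format_results(counts):
    """Return the results text: one "word: count" entry per line, in
    alphabetical order, each terminated by carriage-return, line-feed."""
    lines = []
    for word in sorted(counts):
        lines.append("%s: %d\r\n" % (word, counts[word]))
    return "".join(lines)
```

When the text is written out, the results file would need to be opened in binary mode (and, on Python 3, the text encoded as ASCII first), so that the platform does not translate the line endings a second time.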

Guidelines

The guidelines below are indications of maximum size only. Your program should cope with longer words, lines and files. However, you can use this information to help you select your algorithm.

Word length: 100 characters maximum (ASCII encoding only)
Line length: 1000 characters maximum
Number of words in word list: 100,000 maximum

Quality

Quality is paramount. You should make sure that your program is coded in a professional manner and it should be thoroughly commented throughout. While running time is important to us, we do not need you to spend a lot of time tuning and we would like to see your first correct effort. Spend more time getting it right than getting it fast.

Test files

We have provided some files to help you in PythonTestFiles.zip. This has two files in it:

  • words.txt - a sample set of words
  • count.txt - the expected output from words.txt

Please ensure that your program reproduces the count.txt output exactly. We will use binary comparison to see that your program is correct.
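
One way to check your own output byte for byte before submitting is sketched below. The helper names are hypothetical, and this is not necessarily the comparison tool that will be used to assess your submission.

```python
def read_bytes(path):
    """Read a whole file in binary mode so that no newline translation occurs."""
    handle = open(path, "rb")
    try:
        return handle.read()
    finally:
        handle.close()


def files_identical(path_a, path_b):
    """Return True only if the two files contain exactly the same bytes."""
    return read_bytes(path_a) == read_bytes(path_b)
```

A difference in line endings alone (for example, a line feed where carriage-return, line-feed is expected) is enough to make this comparison fail.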

Part 2: Analysis Document

In addition to the program, write an analysis of the algorithm you have chosen and of other possible algorithms to solve the problem. Look at the expected running time of the different algorithms as a function of the number of words and the number of duplicate words. This should be no more than 4 KB of plain ASCII text. It should be stored in a file called analysis.txt and submitted along with your program.