S. 9. Nowadays one can easily download DNA samples from the Internet. Such a sample is a string of the letters A, C, T and G, which abbreviate the four nucleotide bases adenine, cytosine, thymine and guanine.

Your task is to locate a given sequence (a "gene") in the given sample of letters A, C, T and G. The sample and the sequence are given in two text files. The first row of each file contains the number of letters; the sample (or the sequence) itself, made up of the letters A, C, T and G, follows. For the sake of readability, the lines are wrapped to have at most 100 characters.
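The file format described above could be read as in the following minimal sketch. The function name `read_sequence` is hypothetical, and the check that the declared length matches the data is an added assumption, not something the problem statement requires.

```python
def read_sequence(path):
    """Read a sample or sequence file: a length row followed by wrapped data rows."""
    with open(path, "r", encoding="ascii") as f:
        declared = int(f.readline())  # first row: the number of letters
        # The remaining rows hold the letters, wrapped at up to 100 characters each.
        data = "".join(line.strip() for line in f)
    if len(data) != declared:  # assumed sanity check, not part of the task
        raise ValueError(f"expected {declared} letters, found {len(data)}")
    return data
```

Joining the stripped rows removes the line wrapping, so positions in the returned string correspond directly to positions in the sample.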

Your program gets the necessary file names from the command line: first the name of the sample file, then the name of the file containing the desired sequence. Your program should write 0 to the standard output if the sample does not contain the sequence; otherwise it should write i, where the first occurrence of the sequence in the sample begins at the i-th position. (Positions are numbered from 1.)
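The command-line and output conventions above might be sketched as follows. The helper names `load_letters` and `first_position` are hypothetical. The sketch relies on Python's built-in `str.find`, which returns a 0-based index or -1 when there is no match, so adding 1 gives the required answer in both cases.

```python
import sys

def load_letters(path):
    # Drop the leading count, then join the wrapped rows of letters.
    # Assumes the file contains nothing besides the count and the letters.
    with open(path, "r", encoding="ascii") as f:
        return "".join(f.read().split()[1:])

def first_position(sample, sequence):
    # str.find is 0-based and returns -1 when absent; +1 maps these to i and 0.
    return sample.find(sequence) + 1

def main(argv):
    # The sample file name comes first, then the sequence file name.
    sample_path, sequence_path = argv[1], argv[2]
    print(first_position(load_letters(sample_path), load_letters(sequence_path)))

if __name__ == "__main__" and len(sys.argv) == 3:
    main(sys.argv)
```

For example, `first_position("ACGTACGT", "GTA")` evaluates to 3, and `first_position("ACGT", "TT")` evaluates to 0.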

It can be assumed that the sample has at most 50 million characters, while the sequence consists of at most 1 million characters.
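With inputs this large, comparing the sequence against every position of the sample letter by letter could take on the order of 50 million × 1 million comparisons in the worst case. One well-known way to guarantee time linear in the total length is the Knuth-Morris-Pratt algorithm, sketched below. The problem statement does not say which method is expected, and the function name `kmp_first_position` is hypothetical.

```python
def kmp_first_position(sample, sequence):
    """Return the 1-based position of the first occurrence, or 0 if there is none."""
    m = len(sequence)
    if m == 0:
        return 1  # assumption: an empty sequence matches at position 1
    # fail[i] = length of the longest proper prefix of sequence[:i + 1]
    # that is also a suffix of it.
    fail = [0] * m
    k = 0
    for i in range(1, m):
        while k > 0 and sequence[i] != sequence[k]:
            k = fail[k - 1]
        if sequence[i] == sequence[k]:
            k += 1
        fail[i] = k
    k = 0  # number of letters of the sequence currently matched
    for i, ch in enumerate(sample):
        while k > 0 and ch != sequence[k]:
            k = fail[k - 1]  # fall back without re-reading the sample
        if ch == sequence[k]:
            k += 1
        if k == m:
            return i - m + 2  # 0-based end index i converted to a 1-based start
    return 0
```

The scan never moves backwards in the sample, so the sample could also be processed row by row instead of being held in memory as a single string. A pure Python loop like this is likely to be slow on a full 50-million-letter input; the same logic carries over directly to a compiled language.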

(10 points)

Deadline expired on 15 June 2005.

Statistics on problem S. 9.
7 students sent a solution. 10 points: Engedy Balázs, Treszkai László. 9 points: Deák 666 Áron, Vincze János. 8 points: 2 students. 0 points: 1 student.

• Problems in Information Technology of KöMaL, May 2005