Splitting a File

Make no mistake: my coding fu remains quite weak. Still, splitting a text file into smaller files on a short string that sits on a line by itself shouldn’t be too hard a problem. And the splitting itself isn’t that hard when someone hands you a Perl script. What’s hard is finding a way to split the files yourself and to have each resulting file named after the string on which it was split.

What do I mean by this? I have a large text file, two of them actually, which are made up of over one hundred texts each. Each text begins with `–###–` and proceeds for some number of lines before the next `–###–` occurs.

I would like to split these larger files into their constituent parts and have each part saved in a file named after its number, `###`. This shouldn’t be as hard as it is. I have tried:

split -p '^–[0-9][0-9][0-9]–' mytext.txt


csplit -k individuals.txt /–[0-9][0-9][0-9]–/

And that’s just to split the file. (Neither worked.) This Perl script did:


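$_ = do { local $/; <> };             # slurp the input file (assumed; not shown in the original)
@arr = split(/–[0-9][0-9][0-9]–/);    # split on the delimiter
for $c (@arr) {
    open(FO, '>', ++$n . '.txt') or die $!;   # assumed naming: 1.txt, 2.txt, ...
    print FO $c;
}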

But it simply numbers the output files in sequence, which loses each text’s original identifying number.
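
One way to keep the identifying number is to capture it from the delimiter line and use it as the filename. What follows is a minimal sketch in Python rather than Perl, and it rests on a few assumptions of mine: that each delimiter sits alone on its line exactly as `–###–` (en dash, three digits, en dash), that the file is UTF-8, and that no number repeats (a repeated number would overwrite the earlier file). The function name `split_by_marker` and the `.txt` extension are made up for illustration.

```python
import re
from pathlib import Path

# Assumed delimiter: a line holding only en dash, three digits, en dash.
DELIMITER = re.compile(r"^–([0-9]{3})–\s*$")


def split_by_marker(source, out_dir="."):
    """Write each text in `source` to <out_dir>/<number>.txt; return the paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    current = None  # handle of the file currently being written, if any
    with open(source, encoding="utf-8") as infile:
        for line in infile:
            match = DELIMITER.match(line)
            if match:
                # A new text starts here: close the previous file, open the next.
                if current is not None:
                    current.close()
                path = out_dir / f"{match.group(1)}.txt"
                current = open(path, "w", encoding="utf-8")
                written.append(path)
            elif current is not None:
                current.write(line)
            # Lines before the first delimiter are skipped.
    if current is not None:
        current.close()
    return written
```

Calling `split_by_marker("mytext.txt", "texts")` would then write one file per text into a `texts` folder, each named after the number in its delimiter line. The delimiter line itself is left out of the output; writing `line` in the `if match:` branch as well would keep it.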