Possible Regex 2-26-13
2015-01-13 azim58 - Possible Regex 2-26-13
Regex to transform a FASTA file into
Example FASTA file found here:
"C:\kurt\storage\CIM Research Folder\DR\2013\2-26-13\regex_test\example_fasta_2-26-13.txt"
It looks like I will not be able to use a simple regular expression,
because if I want to match multiple lines after the > header I cannot
refer back to all of those lines. A capture group holds only one match:
if a group matches multiple times (for example, under a quantifier), a
backreference returns only the last instance. Oh wait, actually I can
probably make a single group that matches everything up to the next header.
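To make the point about repeated groups concrete, here is a minimal sketch in Python's re module (not regexr or Notepad++; the three-line record is made up for illustration). It shows a repeated group keeping only its last match, and a single group that spans everything up to the next header.

```python
import re

# Hypothetical record: one header, three sequence lines, then the next ">".
text = ">seq1\nACGT\nTTGA\nCCAT\n>"

# The inner group (.+) repeats once per sequence line,
# but only the final repetition can be referred back to.
repeated = re.match(r">(.+)\n(?:(.+)\n)+", text)
last_only = repeated.group(2)

# One lazy group that runs up to (but not including) the next ">".
spanning = re.match(r">(.+)\n([\s\S]+?)(?=>)", text)
whole_block = spanning.group(2)

print(repr(last_only))    # 'CCAT'
print(repr(whole_block))  # 'ACGT\nTTGA\nCCAT\n'
```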
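I think I can transform the text the way that I need to in two moves (two
find and replaces).
Note that one new line with a ">" must be added at the end of the file,
because the first regex finds the end of each record by looking ahead for
the next ">". Without it the last record would not match.
The first find and replace is like this.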
Find:
(>.+?)\r([\s\S]+?(?=>))
Replace:
$1\t$2
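The first move can be sketched in Python as follows. This is an illustration under assumptions, not the regexr or Notepad++ run itself: the sample text is made up, the second group is written as [\s\S]+? on the assumption that the intent is "everything up to the next header", and \n is used in place of \r because Python's "." also matches \r.

```python
import re

# Hypothetical sample; the lone ">" line at the end is the sentinel
# that the note says must be added.
fasta = ">seq1 first\nACGT\nTTGA\n>seq2 second\nGGCC\n>\n"

# Group 1: the header line. Group 2: everything up to the next ">".
first_move = re.compile(r"(>.+?)\n([\s\S]+?(?=>))")

# Python writes backreferences as \1 and \2 instead of $1 and $2.
step1 = first_move.sub(r"\1\t\2", fasta)

print(repr(step1))
# '>seq1 first\tACGT\nTTGA\n>seq2 second\tGGCC\n>\n'
```

The header is now followed by a tab, but the sequence is still split across lines. The sentinel ">" line is left untouched because it has no header text for group 1 to match.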
Then the text will be in a format like this.
"C:\kurt\storage\CIM Research Folder\DR\2013\2-26-13\regex_test\1st_transform_0913.txt"
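Then this regex can be used to reformat this new text into the final
desired column text. It deletes every line break except the ones directly
before a ">", so each record ends up on a single line.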
Find:
(.+?)(?=\r)(?!\r>)(\r)
Replace:
$1
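A matching sketch of the second move, again in Python and again with \n standing in for \r. The input string is a made-up example of what the text looks like after the first move; the final split into rows is an added convenience and not part of the original two moves.

```python
import re

# Hypothetical text as it would look after the first move.
step1 = ">seq1 first\tACGT\nTTGA\n>seq2 second\tGGCC\n>\n"

# Match a line and its line break, unless the break is followed by ">".
second_move = re.compile(r"(.+?)(?=\n)(?!\n>)(\n)")

# Keep the line text (group 1) and drop the line break.
step2 = second_move.sub(r"\1", step1)

# Split into (header, sequence) rows, skipping the sentinel ">" line.
rows = [line.split("\t") for line in step2.splitlines() if line != ">"]

print(repr(step2))  # '>seq1 first\tACGTTTGA\n>seq2 second\tGGCC\n>'
print(rows)         # [['>seq1 first', 'ACGTTTGA'], ['>seq2 second', 'GGCC']]
```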
The resulting text looks like this.
"C:\kurt\storage\CIM Research Folder\DR\2013\2-26-13\regex_test\2nd_transform_0934.txt"
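I know this works in regexr. Now I just need to verify that this works in
Notepad++. For Notepad++ I must change the \r to match the file's line
ending: \r\n for Windows-format files, or \n for Unix-format files (the
form shown below).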
So the 1st move should be
Find:
(>.+?)\n([\s\S]+?(?=>))
Replace:
$1\t$2
2nd move
Find:
(.+?)(?=\n)(?!\n>)(\n)
Replace:
$1
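Since the only difference between the regexr and Notepad++ versions is the line-ending token, both moves can be parameterized on it. The sketch below is a hypothetical Python helper (fasta_to_columns is not from the original notes). It assumes the input ends with a line ending, adds the sentinel ">" line itself, and removes it again at the end.

```python
import re

def fasta_to_columns(text: str, eol: str = "\r\n") -> str:
    """Apply both find-and-replace moves for a given line ending."""
    nl = re.escape(eol)

    # Add the sentinel ">" line so the last record has a "next header".
    text = text + ">" + eol

    # Move 1: header, tab, then everything up to the next ">".
    step1 = re.sub(rf"(>.+?){nl}([\s\S]+?(?=>))", r"\1\t\2", text)

    # Move 2: drop every line ending that is not followed by ">".
    step2 = re.sub(rf"(.+?)(?={nl})(?!{nl}>)({nl})", r"\1", step1)

    # Remove the sentinel ">" left at the end of the text.
    return step2[:-1] if step2.endswith(">") else step2

windows = fasta_to_columns(">seq1\r\nACGT\r\nTTGA\r\n>seq2\r\nGGCC\r\n")
unix = fasta_to_columns(">seq1\nACGT\nTTGA\n>seq2\nGGCC\n", eol="\n")

print(repr(windows))  # '>seq1\tACGTTTGA\r\n>seq2\tGGCC\r\n'
print(repr(unix))     # '>seq1\tACGTTTGA\n>seq2\tGGCC\n'
```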