using System;
using System.Collections.Generic;
using System.Linq;

public static class ProteinTranslation
{
    // Translates the strand codon by codon, stopping at the first STOP codon.
    public static string[] Proteins(string strand) =>
        strand.Chunked(3).Select(ToProtein).TakeWhile(protein => protein != "STOP").ToArray();

    // Maps a single codon to its protein name, or to "STOP" for a stop codon.
    private static string ToProtein(string input) =>
        input switch
        {
            "AUG" => "Methionine",
            "UUU" => "Phenylalanine",
            "UUC" => "Phenylalanine",
            "UUA" => "Leucine",
            "UUG" => "Leucine",
            "UCU" => "Serine",
            "UCC" => "Serine",
            "UCA" => "Serine",
            "UCG" => "Serine",
            "UAU" => "Tyrosine",
            "UAC" => "Tyrosine",
            "UGU" => "Cysteine",
            "UGC" => "Cysteine",
            "UGG" => "Tryptophan",
            "UAA" => "STOP",
            "UAG" => "STOP",
            "UGA" => "STOP",
            _ => throw new Exception("Invalid sequence")
        };

    // Lazily yields consecutive substrings of length size.
    // The range throws ArgumentOutOfRangeException if a chunk would run past the end of input.
    private static IEnumerable<string> Chunked(this string input, int size)
    {
        for (var i = 0; i < input.Length; i += size)
            yield return input[i .. (i + size)];
    }
}

The Proteins() method starts by calling the private static extension method Chunked(), which is also an iterator method. It uses yield return to produce chunks of the input string lazily, as an IEnumerable of strings. On each pass of the for loop, the range operator slices out the chunk that starts at index i and ends just before index i + size, and i then advances by the size argument.
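
As a small sketch of the same iterator pattern, the variant below clamps the end of the range so that a trailing chunk shorter than size is returned instead of raising ArgumentOutOfRangeException. The names ChunkDemo and ChunkedSafe are hypothetical and are not part of the solution above. Whether a short final chunk should be tolerated at all is an assumption that depends on how incomplete strands ought to be handled.

```csharp
using System;
using System.Collections.Generic;

public static class ChunkDemo
{
    // Hypothetical variant of Chunked(): yields consecutive pieces of at most
    // `size` characters, clamping the last range to the end of the string.
    public static IEnumerable<string> ChunkedSafe(this string input, int size)
    {
        for (var i = 0; i < input.Length; i += size)
            yield return input[i..Math.Min(i + size, input.Length)];
    }
}
```

With this variant, "AUGUUUU".ChunkedSafe(3) yields AUG, UUU and then U, whereas the original Chunked() would throw when it reached that last chunk.
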
The output of Chunked() is chained to the input of the LINQ Select() method. Select() is given ToProtein as a method group, which behaves like a lambda that takes each codon chunk and passes it as the argument to the private static ToProtein() method. ToProtein() is private because it isn't needed outside the class. It is static because it doesn't use any instance state, and because ProteinTranslation is itself a static class, whose members must all be static. Inside ToProtein(), a switch expression looks up and returns the matching protein for the codon, and throws an exception for any codon it does not recognize.
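
The sketch below illustrates two points from this step, under stated assumptions. First, passing a method group to Select() gives the same result as wrapping the call in a lambda. Second, on C# 9 or later, the or pattern can group codons that map to the same protein. CodonLookupDemo and its members are hypothetical names, and the lookup is deliberately shortened to a few codons.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CodonLookupDemo
{
    // Hypothetical, shortened lookup: only a few codons are listed here.
    // The `or` pattern groups codons that translate to the same protein.
    public static string ToProtein(string codon) => codon switch
    {
        "AUG" => "Methionine",
        "UUU" or "UUC" => "Phenylalanine",
        "UAA" or "UAG" or "UGA" => "STOP",
        _ => throw new ArgumentException($"Invalid codon: {codon}", nameof(codon))
    };

    // Passing the method group directly...
    public static string[] WithMethodGroup(IEnumerable<string> codons) =>
        codons.Select(ToProtein).ToArray();

    // ...is equivalent to wrapping the call in a lambda.
    public static string[] WithLambda(IEnumerable<string> codons) =>
        codons.Select(codon => ToProtein(codon)).ToArray();
}
```

Both helper methods return the same array for the same input. The method group form is simply shorter when the lambda would do nothing but forward its argument.
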
Each protein produced by Select() is passed to the TakeWhile() method, whose lambda tests that the value is not "STOP". TakeWhile() keeps yielding proteins for as long as the lambda returns true. The first time the lambda returns false, at a STOP codon, TakeWhile() stops iterating, so nothing after that codon is translated.
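
To make the difference between stopping and filtering concrete, here is a minimal sketch that contrasts TakeWhile() with the LINQ Where() method on the same data. TakeWhileDemo, its members, and the sample values are hypothetical and chosen only for illustration.

```csharp
using System.Linq;

public static class TakeWhileDemo
{
    private static readonly string[] Translated = { "Methionine", "STOP", "Serine" };

    // TakeWhile() ends the sequence at the first element that fails the test.
    public static string[] UntilStop() =>
        Translated.TakeWhile(protein => protein != "STOP").ToArray();

    // Where() only drops the failing elements and keeps going.
    public static string[] WithoutStop() =>
        Translated.Where(protein => protein != "STOP").ToArray();
}
```

UntilStop() returns only Methionine, while WithoutStop() returns Methionine and Serine. The first behavior is the one translation needs, since everything after a STOP codon should be ignored.
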
The proteins that pass through TakeWhile() are chained into the ToArray() method, which runs the lazy pipeline and collects the results into the string array that Proteins() returns.
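
Because the earlier steps are lazy, nothing is actually translated until ToArray() enumerates the chain. The following sketch shows that deferred execution with a call counter. DeferredDemo and its sample data are hypothetical, and an uppercase conversion stands in for the real codon lookup.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    public static (int Before, int After, string[] Result) Run()
    {
        var calls = 0;

        // Building the query does not run the lambda yet.
        IEnumerable<string> query = new[] { "aug", "uuu" }
            .Select(codon => { calls++; return codon.ToUpperInvariant(); });
        var before = calls;

        // ToArray() enumerates the query, so the lambda runs once per element.
        var result = query.ToArray();
        return (before, calls, result);
    }
}
```

Run() reports zero lambda calls before ToArray() and two afterwards, one per element.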