Because of formalin fixation and potentially long storage under non-ideal conditions, DNA extracted from FFPE tissue is often severely damaged, and sequence artifacts are frequently detected. In particular, cytosine-to-thymine transitions constitute the majority of these artifacts. While the exact mechanism is not known, one explanation is that deamination of cytosine produces a uracil at that position, which pairs with adenine. Upon sequencing, this modified base is read as a C>T transition. If strand information is not preserved during library construction, either strand may be sequenced, and the artifact may therefore appear as either a C>T or a G>A transition. Removal of these artifacts is essential, particularly in sequencing analysis of cancer samples, where they may otherwise appear as false-positive mutations.
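The strand symmetry described above (a G>A call on one strand is the same event as a C>T call on the other) can be illustrated with a minimal Python sketch. The function names `canonical_substitution` and `is_deamination_candidate` are hypothetical and are not part of the protocol or of any particular variant-calling tool. The sketch only flags substitutions that are consistent with cytosine deamination; it cannot tell an artifact from a true mutation.

```python
# Watson-Crick complement of each DNA base.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def canonical_substitution(ref, alt):
    """Express a single-base substitution relative to the pyrimidine (C or T) strand.

    A G>A call is reported as C>T, because without strand information
    the two describe the same event seen from opposite strands.
    """
    ref, alt = ref.upper(), alt.upper()
    if ref not in COMPLEMENT or alt not in COMPLEMENT or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref}>{alt}")
    if ref in ("G", "A"):
        # Purine reference base: switch to the complementary strand.
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"


def is_deamination_candidate(ref, alt):
    """True if the substitution is C>T or its reverse-strand equivalent G>A."""
    return canonical_substitution(ref, alt) == "C>T"
```

For example, `is_deamination_candidate("G", "A")` and `is_deamination_candidate("C", "T")` both return `True`, whereas `is_deamination_candidate("A", "G")` returns `False`.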
In this study, we developed a novel protocol for extracting DNA from FFPE samples that includes an artifact removal step specifically targeting the C>T and G>A artifacts caused by deamination. The improved protocol gave up to 2.5x higher yields, depending on the quality of the sample source. Sequencing of DNA extracted from normal and carcinoma liver tissue samples confirmed that C>T and G>A were the most prevalent mutations at low frequencies (<0.3). In contrast to current standard methods, an enzymatic artifact removal step during the extraction procedure removed >90% of C>T/G>A variants at frequencies <0.3. Further analysis of the sequencing data revealed that 4 COSMIC C>T/G>A mutations present in carcinoma liver sample DNA obtained with a standard protocol were not present with the novel protocol, indicating that the new kit specifically targets artificial mutations while leaving true mutations unchanged.
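One way such a removal percentage could be computed from two variant call sets is sketched below in Python. This is an assumption about the comparison, not the analysis actually performed in the study: it counts C>T/G>A calls below a frequency threshold in the standard-protocol call set and reports the fraction that are absent from the call set obtained with the artifact removal step. The function name, the tuple layout `(chrom, pos, ref, alt, vaf)`, and the toy data are all hypothetical.

```python
# Substitutions consistent with cytosine deamination on either strand.
DEAMINATION_LIKE = {("C", "T"), ("G", "A")}


def removal_fraction(standard_calls, treated_calls, max_vaf=0.3):
    """Fraction of low-frequency C>T/G>A calls from the standard protocol
    that are absent from the treated (artifact-removal) call set.

    Each call is a tuple (chrom, pos, ref, alt, vaf).
    Returns None if the standard set has no qualifying calls.
    """
    treated_keys = {(c, p, r, a) for c, p, r, a, _ in treated_calls}
    candidates = [
        (c, p, r, a)
        for c, p, r, a, vaf in standard_calls
        if (r, a) in DEAMINATION_LIKE and vaf < max_vaf
    ]
    if not candidates:
        return None
    removed = sum(1 for key in candidates if key not in treated_keys)
    return removed / len(candidates)


# Hypothetical toy data, for illustration only (not results from the study).
standard = [
    ("chr1", 100, "C", "T", 0.05),  # low-frequency C>T, absent after treatment
    ("chr1", 200, "G", "A", 0.10),  # low-frequency G>A, still present
    ("chr2", 300, "C", "T", 0.45),  # above threshold, not counted
    ("chr2", 400, "A", "G", 0.08),  # not C>T/G>A, not counted
]
treated = [
    ("chr1", 200, "G", "A", 0.09),
    ("chr2", 300, "C", "T", 0.44),
    ("chr2", 400, "A", "G", 0.08),
]
fraction = removal_fraction(standard, treated)  # 1 of 2 candidates removed
```

Matching on position and alleles alone treats any call missing from the treated set as removed; a real comparison would also need to account for differences in coverage between the two libraries.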