MPEP 2423.01
Format and Symbols To Be Used in Sequence Listings

This is the Ninth Edition of the MPEP, Revision 08.2017, Last Revised in January 2018


2423.01    Format and Symbols To Be Used in Sequence Listings [R-07.2015]

37 CFR 1.822 sets forth the format and symbols to be used for listing nucleotide and/or amino acid sequence data. The symbols for representing the nucleotide and/or amino acid characters in the sequences are set forth in the tables of WIPO Standard ST.25 (1998), Appendix 2, Tables 1 and 3. See MPEP § 2422. No other symbols shall be used in nucleotide and amino acid sequences. The "modified base" and "modified and unusual amino acid" symbols appearing in WIPO Standard ST.25 (1998), Appendix 2, Tables 2 and 4 (see 37 CFR 1.822 and MPEP § 2422) are not to be set forth in the sequences recited in the sequence listing. However, "modified base" or "modified and unusual amino acid" symbols may be used in the written description and/or drawing portions of the specification. To properly enter notations for modified bases or amino acids in the sequence listing, the Feature section of the sequence listing should be used. That is, a modified base or amino acid may be presented in a given sequence as the corresponding unmodified base or amino acid if the modified base or amino acid is one of those listed in WIPO Standard ST.25 (1998), Appendix 2, Table 2 or 4 and the modification is also set forth in the Feature section of the sequence listing. Otherwise, all nucleotide bases or amino acids not appearing in WIPO Standard ST.25 (1998), Appendix 2, Table 1 or 3 must be listed in a given sequence as "n" or "Xaa," respectively, with further information given in the Feature section of the sequence listing. See 37 CFR 1.822(b).
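The fallback rule for nucleotides described above (a symbol that is not in Table 1 is listed as "n", with the detail given in the Feature section) can be sketched as follows. This is an illustrative sketch only, not an Office tool: the function name, the feature-note wording, and the permitted-symbol set are assumptions made for the example, and the set should be checked against WIPO Standard ST.25 (1998), Appendix 2, Table 1 before any real use. The sketch does not handle the separate case of a Table 2 modified base, which may be presented as its corresponding unmodified base.

```python
# ASSUMPTION: this set stands in for WIPO Standard ST.25 (1998), Appendix 2,
# Table 1. Verify it against the table itself; it is not taken from this section.
ASSUMED_TABLE_1 = set("acgturymkswbdhvn")


def normalize_nucleotides(symbols):
    """Return (sequence, features) for a list of per-position symbols.

    Any symbol outside the assumed Table 1 set is written as 'n', and a
    note is recorded at its 1-based position so the detail can be given
    in the Feature section of the sequence listing.
    """
    sequence = []
    features = []
    for position, symbol in enumerate(symbols, start=1):
        if symbol in ASSUMED_TABLE_1:
            sequence.append(symbol)
        else:
            sequence.append("n")
            # Hypothetical note format; the real Feature entry is free to differ.
            features.append((position, "unlisted symbol " + repr(symbol)))
    return "".join(sequence), features


if __name__ == "__main__":
    # 'z' is a made-up symbol used only to trigger the fallback.
    print(normalize_nucleotides(["a", "c", "z", "g"]))
```

The same pattern would apply to amino acids, with "Xaa" as the fallback symbol and Table 3 as the permitted set.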

37 CFR 1.822(b) and 37 CFR 1.822(d) require the use of three-letter symbols for amino acids in the sequence listing. The three-letter symbols must be presented using upper case for the first character and lower case for the remaining two characters. Applicants are encouraged to use the three-letter symbols for amino acids throughout the disclosure, instead of the one-letter symbols, for easier reading of the application and any patent issuing therefrom.
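The capitalization requirement for three-letter symbols can be checked mechanically. The following is a minimal sketch with hypothetical helper names; it tests only the shape of a symbol (one upper-case letter followed by two lower-case letters) and does not check whether the symbol actually appears in WIPO Standard ST.25 (1998), Appendix 2, Table 3.

```python
import re

# First character upper case, remaining two characters lower case.
THREE_LETTER = re.compile(r"[A-Z][a-z]{2}")


def has_three_letter_form(symbol):
    """Return True if `symbol` has the required capitalization pattern."""
    return THREE_LETTER.fullmatch(symbol) is not None


def to_three_letter_case(symbol):
    """Re-case a three-letter alphabetic string, e.g. 'ALA' or 'ala' to 'Ala'.

    Raises ValueError for anything that is not exactly three ASCII letters.
    """
    if len(symbol) != 3 or not (symbol.isascii() and symbol.isalpha()):
        raise ValueError("expected exactly three ASCII letters: " + repr(symbol))
    return symbol[0].upper() + symbol[1:].lower()


if __name__ == "__main__":
    for candidate in ["Ala", "ALA", "xaa"]:
        print(candidate, has_three_letter_form(candidate), to_three_letter_case(candidate))
```

A listing checker built on this sketch would still need a lookup against the Table 3 symbols, since a well-formed string such as "Abc" passes the shape test without being a permitted symbol.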

37 CFR 1.822(c) through (e) set forth the format for presenting sequence data. These paragraphs set forth the manner in which the characters in sequences are to be grouped, spaced, presented and numbered.