MPEP 2429
Helpful Hints for Sequence Rules Compliance under WIPO ST.25

Ninth Edition of the MPEP, Revision 07.2022, Last Revised in February 2023

2429    Helpful Hints for Sequence Rules Compliance under WIPO ST.25 [R-07.2022]

[Editor Note: This section is not applicable to applications filed on or after July 1, 2022, having disclosures of nucleotide and/or amino acid sequences as defined in 37 CFR 1.831(b). See MPEP §§ 2412-2419 for guidance on WIPO ST.26 requirements for applications filed on or after July 1, 2022.]

The Office offers the following tips for complying with the sequence rules under WIPO ST.25.

—Compliance is not a filing date issue.

—Compliance is not a 35 U.S.C. 112 issue.

—Compliance is not a 35 U.S.C. 119/120 issue.

—Compliance is not per se a new matter issue. The standard for resolution of inconsistencies between the "Sequence Listing" (submitted as an ASCII plain text file, on read-only optical disc(s), as a PDF image file, or on physical sheets of paper pursuant to 37 CFR 1.821(c)) and a separate computer readable form thereof pursuant to 37 CFR 1.821(e) (if required) and/or errors in the "Sequence Listing" is based on the new matter standard.

—Compliance can be achieved via amendment. See MPEP § 2426 for additional information regarding amendments to add or replace a "Sequence Listing" and CRF thereof. See 37 CFR 1.825.

—If sequence information is submitted in an application filed under 35 U.S.C. 111(a) or 35 U.S.C. 371 as an ASCII plain text file in compliance with 37 CFR 1.824 via the USPTO patent electronic filing system or on read-only optical disc(s) and applicant has not filed a "Sequence Listing" as a PDF image file or on physical sheets of paper, the ASCII plain text file will serve as both the "Sequence Listing" under 37 CFR 1.821(c) and the CRF of the "Sequence Listing" under 37 CFR 1.821(e). See 37 CFR 1.821(e)(1). Thus, the following are not required and should not be submitted: (1) a second copy of the "Sequence Listing" as a PDF image file or on physical sheets of paper; and (2) a statement under 37 CFR 1.821(e)(1)(ii) or (2)(ii) (indicating that the sequence information contained in the "Sequence Listing" under 37 CFR 1.821(c) and CRF copy of the "Sequence Listing" under 37 CFR 1.821(e)(1)(ii) or 37 CFR 1.21(o) are identical). Any "Sequence Listing" submitted as an ASCII plain text file via the USPTO patent electronic filing system or on read-only optical disc(s) under 37 CFR 1.52(e) and in compliance with 37 CFR 1.821(c) will be excluded when determining the application size fee required by 37 CFR 1.16(s) or 37 CFR 1.492(j) as per 37 CFR 1.52(f)(1) and (2). See MPEP § 2422.03(a) for additional information.

—The USPTO encourages applicants to file their patent applications via the USPTO patent electronic filing system and imposes a surcharge for non-electronic filing of an original patent application (excluding reissue, design, plant, and provisional applications). Filing a "Sequence Listing" via the USPTO patent electronic filing system as a PDF image file or on physical sheets of paper is not recommended. A "Sequence Listing" in PDF format or on physical sheets of paper is treated as the "Sequence Listing" required by 37 CFR 1.821(c) and requires filing of both a separate CRF and a statement that the "Sequence Listing" and the CRF are identical in an application filed under 35 U.S.C. 111(a) regardless of whether the sequence information is also filed as an ASCII plain text file, as required by 37 CFR 1.821(e)(1)(ii), and in a national stage application filed under 35 U.S.C. 371 if the sequence information is not also filed as an ASCII plain text file, as required by 37 CFR 1.821(e)(2)(ii). In addition, a "Sequence Listing" submitted in PDF format or on physical sheets of paper as part of the specification is not excluded when determining the application size fee required by 37 CFR 1.16(s) or 1.492(j). See 37 CFR 1.52(f)(1) and (2).

—For international applications (PCT), the check list of the PCT Request filed with the international application must contain an indication that the sequence listing, filed with the PCT application on the international filing date forms part of the international application. See MPEP § 2422.03(a), subsection IV, for information specific to filing sequence listings in international applications (PCT) via the USPTO patent electronic filing system.

—Applicants are reminded that for fee purposes, a table of sequences is not a "Sequence Listing". Such tables are considered part of the specification and are included when determining the application size fee required by 37 CFR 1.16(s) or 1.492(j). See 37 CFR 1.52(f)(1) and (2).

—Applicants are encouraged to draft their specifications such that sequence data that is not essential material is not required to be included in a "Sequence Listing". 37 CFR 1.21(o) and 1.52(f)(3) provide that the submission of an oversized "Sequence Listing" (a mega-"Sequence Listing") of 300 MB or more is subject to additional fees. A mega-"Sequence Listing", in particular, often includes sequences that are available in the prior art, are not essential material, and could have been described instead, for example, by name and a publication or accession reference.

—Failure to reply to sequence compliance issues in a timely manner may reduce any patent term adjustment. Patent applications filed under 35 U.S.C. 111(a) on or after December 18, 2013, and international patent applications in which the national stage commenced under 35 U.S.C. 371 on or after December 18, 2013, may be subject to reductions in patent term adjustment pursuant to 37 CFR 1.704(c)(13) if they are not in condition for examination within eight months from the filing date or date of commencement, respectively. "In condition for examination" includes compliance with 37 CFR 1.821 - 1.825 (see 37 CFR 1.704(f)).

—The copy of the "Sequence Listing" required by 37 CFR 1.821(c) is an integral part of the application. If submitted as a PDF image file or on physical sheets of paper, the "Sequence Listing" must begin on a new page, should appear at the end of the application, and preferably should be numbered independently of the numbering of the remainder of the application. The new page that begins the "Sequence Listing" should be entitled "Sequence Listing." See 37 CFR 1.823(b)(3). If not submitted as such at filing, the "Sequence Listing" must be inserted into the application via amendment, e.g., by preliminary amendment. See 37 CFR 1.825. If submitted as an ASCII plain text file via the USPTO patent electronic filing system or read-only optical disc, the specification must contain an incorporation by reference of the material, except for a national stage entry under 37 CFR 1.495(b)(1), where the "Sequence Listing" has been previously communicated to the International Bureau or originally filed in the USPTO and complies with Patent Cooperation Treaty Rule 5.2.

—A replacement "Sequence Listing" must be used to amend a "Sequence Listing" regardless of whether the "Sequence Listing" was filed as an ASCII plain text file via the USPTO patent electronic filing system, on read-only optical disc(s) (37 CFR 1.821(c)(1)), as a PDF file via the USPTO patent electronic filing system (37 CFR 1.821(c)(2)), or on physical sheets of paper (37 CFR 1.821(c)(3)). See MPEP § 2426 for additional information regarding amendments to add or replace a "Sequence Listing" and CRF copy thereof.

—The practice of computer readable form transfers from one application to another has been eliminated.

—Angle brackets and numeric identifiers listed in 37 CFR 1.823 and Appendix G to Subpart G of Part 1 of the CFR (reproduced in MPEP § 2424) are very important for our database. Extra punctuation should not be used in a "Sequence Listing".

—A "Sequence Listing" (37 CFR 1.821(c)) or a separate CRF of a "Sequence Listing" (37 CFR 1.821(e)) as an ASCII plain text file cannot contain page numbers. Page numbers should only be placed on PDF image files or on physical sheets of paper of the "Sequence Listing".

—The PatentIn computer program is not the only means by which to comply with the rules. Any word processing program can be used to generate a "Sequence Listing" if it has the capability to convert a file into ASCII plain text. However, use of a word processing program to generate or amend a "Sequence Listing" file is discouraged. Word processing programs often introduce unintended changes to the "Sequence Listing" that render the listing unacceptable. Use of a plain text editor to generate or edit a "Sequence Listing" is recommended.

—If a word processing program is used to generate a "Sequence Listing", hard page break controls should not be used and margins should be adjusted to the smallest setting.

—Word processing files should not be submitted to the Office; the "Sequence Listing" generated by a word processing file should be saved as an ASCII plain text file for submission. Most word processing programs provide this feature.

—Statements in accordance with 37 CFR 1.821(e)(1)(ii), (e)(2)(ii), (e)(3)(iii), and (h) and 37 CFR 1.825(a)(3), (a)(4), (a)(6), (b)(3), (b)(4), (b)(5), and (b)(7) and proper labeling for read-only optical disc(s) in accordance with 37 CFR 1.52(e)(5)(vi) should be noted. Sample statements to support filings and submissions in accordance with 37 CFR 1.821 through 1.825 are provided in MPEP § 2428 Sample Statements.

—Use Box SEQUENCE. See MPEP § 2433.

—On nucleotide sequences, since only single strands may be depicted in the "Sequence Listing", show strands in 5' to 3' direction from left to right in accordance with 37 CFR 1.822(c)(5).

—The single stranded nucleotide depicted in the "Sequence Listing" may represent a strand of a nucleotide sequence that may be single or double stranded which may be, further, linear or circular. An amino acid sequence or peptide may be linear or circular.

—Numeric identifiers "<140>, Current Application Number," "<141>, Current Filing Date," "<150>, Prior Application Number," and "<151>, Prior Application Filing Date," should appear in the "Sequence Listing" in all cases. If the information about the current application is not known or is unavailable at the time of completing the "Sequence Listing", then the lines following numeric identifiers <140> and <141> should be left blank. This would normally be the case when the "Sequence Listing" is included in a newly filed application. Similarly, if information regarding prior applications is inapplicable, or not known at the time of completing the "Sequence Listing" but will be later filed, then the numeric identifiers <150> and <151> should appear with the line following the numeric identifiers left blank.
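
The blank-line convention for these four identifiers can be sketched as follows. This is a minimal illustration, not an Office tool: the function name `header_lines` and the single space between identifier and value are assumptions made for the example.

```python
def header_lines(app_number=None, filing_date=None,
                 prior_number=None, prior_date=None):
    """Return the <140>, <141>, <150>, and <151> lines.

    All four identifiers always appear; a value that is unknown or
    inapplicable is simply left blank after the identifier.
    """
    fields = [
        ("<140>", app_number),
        ("<141>", filing_date),
        ("<150>", prior_number),
        ("<151>", prior_date),
    ]
    return [f"{tag} {value}" if value else tag for tag, value in fields]


# A newly filed application: nothing is known yet, so all four stay blank.
for line in header_lines():
    print(line)
```

Calling `header_lines()` with no arguments still yields all four identifiers, which mirrors the hint that they should appear in all cases.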

—The mandatory items of information that must be included in a "Sequence Listing" are identified in the table of numeric identifiers set forth in Appendix G to Subpart G of Part 1 of the CFR. See also MPEP § 2424.02.

—Pursuant to 37 CFR 1.83(a), sequences that are included in a "Sequence Listing" should not be duplicated in the drawings. However, significant sequence characteristics that are not readily conveyed by the data in the "Sequence Listing" may be depicted in a drawing figure. In that case, the sequence information so conveyed must still be included in a "Sequence Listing" if the sequence falls within the definition set forth in 37 CFR 1.821(a), and the sequence identifier ("SEQ ID NO:X" or the like) must be used, either in the drawing or in the "Brief Description of the Drawings." See MPEP § 2422.02 for additional information.

—Inosine may be represented by the use of "I" in the features section, otherwise use "n."

—Stop codons that are represented by an asterisk are not permitted in amino acid sequences.

—Punctuation should not be used in a sequence to indicate unknown nucleotide bases or amino acid residues or to delimit active or functional regions of a sequence. These regions should be noted as Features of the sequence per Appendix G to Subpart G of Part 1 of the CFR (see numeric identifiers <220> - <223>). (Appendix G is reproduced in MPEP § 2424.)

—The presence of an unnatural amino acid in a sequence does not have the same effect as the presence of a D-amino acid. The sequence may still be subject to the rules even though one or more of the amino acids is not naturally occurring.

—Cyclic and branched peptides are causing some confusion in the application of the rules. Specific questions should be directed to the Sequence Systems Service Center of the Scientific and Technical Information Center at 571-272-2510.

—A cyclic peptide with a tail is regarded as a branched sequence, and thereby exempt from the rules, if all bonds adjacent to the amino acid from which the tail emanates are normal peptide bonds.

—Sequences that have variable-length regions depicted as, for example, Ala Ala Leu Leu (Xaa Xaa)n Ile Pro where n=0-234 or agccttgggaca(nnnnn)mgtcatt where m=0-354 or Ser Met Ala Xaa Ser where Xaa could be 1, 2, 3, 4 and/or 5 amino acids must still comply with the Sequence Rules. The method to use is to repeat the variable-length region as many times as the maximum length and specify in the Features section that the amino acid (or nucleotide) at a specified position is either absent or present. The variables Xaa and n may stand for only one residue, hence the need to repeat the variable. The correct way to submit the third example is Ser Met Ala Xaa Xaa Xaa Xaa Xaa Ser combined with an explanation in the Features section of the listing that any one or all of amino acids 4-8 can either be present or absent.
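
The expansion described in this hint can be sketched as below, using the third example from the text. The function name `expand_variable_region` is hypothetical; it repeats the placeholder up to the maximum length and reports which positions need the "present or absent" note in the Features section.

```python
def expand_variable_region(prefix, suffix, max_repeat, placeholder="Xaa"):
    """Write a variable-length region at its maximum length.

    prefix and suffix are lists of residues. The placeholder is repeated
    max_repeat times because each Xaa (or n) may stand for only one residue.
    Returns the expanded residue list and the 1-based (start, end) positions
    of the optional residues.
    """
    if max_repeat < 1:
        raise ValueError("max_repeat must be at least 1")
    residues = prefix + [placeholder] * max_repeat + suffix
    start = len(prefix) + 1
    end = len(prefix) + max_repeat
    return residues, (start, end)


# "Ser Met Ala Xaa Ser" where Xaa could be 1 to 5 amino acids
residues, (start, end) = expand_variable_region(["Ser", "Met", "Ala"], ["Ser"], 5)
print(" ".join(residues))  # Ser Met Ala Xaa Xaa Xaa Xaa Xaa Ser
print(start, end)          # 4 8
```

The returned positions, 4 through 8, are the ones the Features section would describe as present or absent.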

—Single letter amino acid abbreviations are not acceptable within the "Sequence Listing" but may appear elsewhere in the application.
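
A simple pre-submission check for single-letter abbreviations and asterisk stop codons might look like the following sketch. The function name and the set of three-letter codes are assumptions for illustration; the set should be checked against the amino acid table in the rules before relying on it. The input is assumed to contain residues only, without the position numbers that appear under a sequence.

```python
# Assumed set of accepted three-letter codes; verify against the rules.
THREE_LETTER_CODES = {
    "Ala", "Arg", "Asn", "Asp", "Cys", "Gln", "Glu", "Gly", "His", "Ile",
    "Leu", "Lys", "Met", "Phe", "Pro", "Ser", "Thr", "Trp", "Tyr", "Val",
    "Asx", "Glx", "Xaa",
}


def check_protein_tokens(sequence):
    """Return (position, token, reason) for each token that is not an
    accepted three-letter code. Positions are 1-based."""
    problems = []
    for position, token in enumerate(sequence.split(), start=1):
        if token in THREE_LETTER_CODES:
            continue
        if token == "*":
            reason = "asterisk stop codon"
        elif len(token) == 1 and token.isalpha():
            reason = "single-letter abbreviation"
        else:
            reason = "not a recognized three-letter code"
        problems.append((position, token, reason))
    return problems


print(check_protein_tokens("Met Ala G Ser *"))
# [(3, 'G', 'single-letter abbreviation'), (5, '*', 'asterisk stop codon')]
```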

—Zero (0) is not used when the numbering of amino acids uses negative numbers to distinguish the mature protein.
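
As an illustration of this numbering, the sketch below assigns negative numbers to the residues that precede the mature protein and starts the mature protein at 1, skipping zero. The function name `number_residues` and its 0-based `mature_start_index` parameter are hypothetical.

```python
def number_residues(total, mature_start_index):
    """Number `total` residues so that the mature protein starts at 1.

    mature_start_index is the 0-based index of the first mature residue.
    Residues before it count back from -1; zero is never used.
    """
    if not 0 <= mature_start_index < total:
        raise ValueError("mature_start_index must fall inside the sequence")
    numbers = []
    for i in range(total):
        offset = i - mature_start_index
        numbers.append(offset + 1 if offset >= 0 else offset)
    return numbers


# Six residues, the first three preceding the mature protein
print(number_residues(6, 3))  # [-3, -2, -1, 1, 2, 3]
```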

—Subscripts or superscripts are not permitted in a "Sequence Listing".

—The exclusive conformance requirement of 37 CFR 1.821(b) requires that any amendment of the sequence information in a "Sequence Listing" be accompanied by an amendment to the corresponding information, if any, embedded in the text of the specification or presented in a drawing figure.

—A mandatory feature is required to cover every "n" or "Xaa" used in a sequence. The feature consists of numeric identifiers <220>, <221>, <222>, and <223>. Numeric identifier <220> should remain blank, numeric identifier <221> should be selected from Appendices E and F to Subpart G of Part 1 of the CFR (reproduced in MPEP § 2422(I)), numeric identifier <222> should identify the location of the "n" or "Xaa" within the sequence, and numeric identifier <223> should specify what the "n" or "Xaa" can be. When all of the "n" or "Xaa" variables in a sequence are equal to the same thing, a range of the entire sequence can be given for numeric identifier <222> to cover all of the "n" or "Xaa" designators in one feature.
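
The four-identifier feature can be assembled as in the sketch below. The function name is hypothetical, and the `(start)..(end)` location notation and the feature key used in the example are assumptions; the key in particular must be selected from Appendices E and F for the molecule type at hand.

```python
def unknown_residue_feature(start, end, key, description):
    """Build the <220> to <223> lines covering one run of "n" or "Xaa".

    start and end are 1-based positions within the sequence. <220> is left
    blank, <221> carries the feature key, <222> the location, and <223> a
    statement of what the "n" or "Xaa" can be.
    """
    if start < 1 or end < start:
        raise ValueError("positions are 1-based and end must not precede start")
    return [
        "<220>",
        f"<221> {key}",
        f"<222> ({start})..({end})",  # assumed location notation
        f"<223> {description}",
    ]


# Hypothetical feature for five optional Xaa residues at positions 4 to 8
feature = unknown_residue_feature(
    4, 8, "VARIANT",
    "Any one or all of amino acids 4-8 can either be present or absent",
)
for line in feature:
    print(line)
```

When every "n" or "Xaa" in a sequence has the same meaning, a single call spanning the whole sequence corresponds to the one-feature shortcut the hint describes.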

—Remove all non-ASCII characters from the .txt file. For example, an α symbol should be spelled out as "alpha."

—Tabs are non-ASCII characters. Do not use tabs in "Sequence Listing" .txt files.
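
Both character problems can be located before submission with a short scan such as the sketch below. The function name is hypothetical. Because a tab has a code point below 128, a code-point test alone would not flag it, so the sketch checks for tabs explicitly.

```python
def find_disallowed_characters(text):
    """Return (line, column, description) for every tab and every
    character outside the 7-bit ASCII range. Both counts are 1-based."""
    problems = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if ch == "\t":
                problems.append((line_no, col, "tab"))
            elif ord(ch) > 127:
                problems.append((line_no, col, f"non-ASCII U+{ord(ch):04X}"))
    return problems


# Hypothetical listing fragment with a Greek alpha and a tab
sample = "<223> binds TNF-\u03b1\n<400>\t1\n"
print(find_disallowed_characters(sample))
# [(1, 17, 'non-ASCII U+03B1'), (2, 6, 'tab')]
```

Each reported character would then be replaced by hand, for example by spelling the symbol out as "alpha" and replacing the tab with spaces.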

—Make all explanations in a feature section consistent with the molecule type in numeric identifier <212>. For example, if the sequence is type "PRT" do not describe the sequence in a feature section as a "synthetic oligonucleotide."

—A response for numeric identifier <130>, "File Reference," is mandatory if the numeric identifier <140> is not present, e.g., when the "Sequence Listing" is filed before the application number has been assigned. At least one of a numeric identifier <130> with docket number or numeric identifier <140> with current application number must be in the "Sequence Listing". This information is used to ensure that ASCII plain text files are correctly matched to their corresponding applications.

—If a "Sequence Listing" is modified by the addition or deletion of sequences, remember to update the total number of sequences in numeric identifier <160>.
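
The two checks above, a file reference or application number being present and <160> matching the actual sequence count, can be sketched as follows. The function name is hypothetical, and the sketch assumes each sequence is introduced by one <210> line, with every identifier at the start of its own line.

```python
import re


def check_general_information(text):
    """Return a list of problems with <130>/<140> and <160>."""
    values = {}
    sequence_count = 0
    for line in text.split("\n"):
        match = re.match(r"\s*<(\d{3})>\s*(.*)", line)
        if not match:
            continue
        tag, value = match.group(1), match.group(2).strip()
        if tag == "210":
            sequence_count += 1  # assumed: one <210> per sequence
        elif tag in ("130", "140", "160"):
            values.setdefault(tag, value)

    problems = []
    if not values.get("130") and not values.get("140"):
        problems.append("need a <130> file reference or a <140> application number")
    declared = values.get("160", "")
    if not declared.isdecimal():
        problems.append("<160> is missing or not a number")
    elif int(declared) != sequence_count:
        problems.append(
            f"<160> says {declared} but {sequence_count} <210> entries were found"
        )
    return problems


# Hypothetical fragment: a sequence was deleted but <160> was not updated
sample_listing = "<130> DOCKET-001\n<140>\n<160> 3\n<210> 1\n<210> 2\n"
print(check_general_information(sample_listing))
# ['<160> says 3 but 2 <210> entries were found']
```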

—Numeric identifier <213> can only be one of three choices: Scientific name (i.e., Genus/species), Unknown, or Artificial Sequence. Do not add any extraneous information about the sequence, such as gene names, in this field. Do not use common names for species. For example, human should be "Homo sapiens" and cow should be "Bos taurus." If a specific genus/species is unknown, use the reply "Unknown" in numeric identifier <213> and add whatever information is known into numeric identifier <223> of the feature section. For example, if only the family "Saccharomycetaceae" is known, numeric identifier <213> should state "Unknown" and numeric identifier <223> could state "fungus of the family Saccharomycetaceae."

—For all sequences using "Unknown" or "Artificial Sequence" for numeric identifier <213>, a mandatory feature is required to explain the source of the genetic material. The feature consists of <220>, which remains blank, and <223>, which states the source of the genetic material. To explain the source, if the sequence is put together from several organisms, please list those organisms. If the sequence is made in the laboratory, please indicate that the sequence is synthesized.
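
The two <213> hints can be combined into one check, sketched below. The function name and the `source_note` parameter (standing for the text of the <223> line) are hypothetical. The lookup table holds only the two common names given as examples above, so it is an illustration rather than a complete list, and the sketch does not verify that a scientific name is real.

```python
# Only the two examples from the hints; not a complete list of common names.
COMMON_TO_SCIENTIFIC = {"human": "Homo sapiens", "cow": "Bos taurus"}


def check_organism(organism, source_note=""):
    """Return a list of problems with a <213> value.

    source_note is the text of the accompanying <223> line, if any.
    """
    problems = []
    name = organism.strip()
    scientific = COMMON_TO_SCIENTIFIC.get(name.lower())
    if scientific:
        problems.append(f'use "{scientific}" instead of "{name}"')
    if name in ("Unknown", "Artificial Sequence") and not source_note.strip():
        problems.append("add a <220>/<223> feature explaining the source")
    return problems


print(check_organism("human"))
# ['use "Homo sapiens" instead of "human"']
print(check_organism("Unknown", "fungus of the family Saccharomycetaceae"))
# []
```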

—Only use abbreviations that are specifically defined in "WIPO Standard ST.25 (2009)" or that are well known. Do not use abbreviations that are specific to the application at issue and would not be clear to someone who had not read the invention description. When in doubt, use the full name rather than an abbreviation.

—Note that if a "Sequence Listing" provided as an ASCII plain text file or a separate CRF of a "Sequence Listing" is rejected and an error report issued, the errors listed are exemplary and may not be a complete list of all errors in the "Sequence Listing" file. The applicant is required to review the "Sequence Listing" in its entirety and correct all instances of similar errors.

—Any inquiries regarding a specific "Sequence Listing" provided as an ASCII plain text file or a separate CRF of a "Sequence Listing" that has been processed by the Office should be directed to the Sequence Systems Service Center of the Scientific and Technical Information Center at 571-272-2510.