Process Subfiles

To filter all files in a container file, you can open the container as a Document object, then open its subfiles as Document objects. After you open a subfile as a Document, you can call its methods to filter the data.

You can iterate over subfile objects by using the subfiles member of a document. Each element returned by the iterator contains information about the subfile, and provides an open method that opens it as a document.

In the following example, we open each subfile as a document and filter its text to standard output.

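import sys

# doc is a container that is already open as a Document.
for subfile in doc.subfiles:
    with subfile.open() as (child, extract_info):
        # Only filter when a child Document was returned.
        if child:
            child.filter(sys.stdout)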
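
The loop above handles one level of subfiles. As a minimal sketch, the same pattern could be wrapped in a recursive helper so that subfiles that are themselves containers are also processed. FakeDocument, FakeSubfile, and filter_all are hypothetical stand-ins written only so that the sketch runs without the SDK; they imitate the subfiles, open, and filter members used above and are not part of the real API. Whether the real iterator already descends into nested containers is not stated here, so treat the recursion as an assumption.

```python
import io
from contextlib import contextmanager


class FakeSubfile:
    """Hypothetical stand-in for an SDK subfile object."""

    def __init__(self, name, child):
        self.name = name
        self._child = child

    @contextmanager
    def open(self):
        # Imitates the (child, extract_info) pair from the example above.
        yield self._child, {"name": self.name}


class FakeDocument:
    """Hypothetical stand-in for an SDK Document object."""

    def __init__(self, text="", subfiles=()):
        self._text = text
        self.subfiles = list(subfiles)

    def filter(self, stream):
        stream.write(self._text)


def filter_all(doc, stream):
    """Filter every subfile of doc, descending into nested containers."""
    for subfile in doc.subfiles:
        with subfile.open() as (child, extract_info):
            if not child:
                continue  # nothing was opened for this subfile
            child.filter(stream)
            filter_all(child, stream)  # assumption: a child may be a container


inner = FakeDocument(subfiles=[FakeSubfile("b.txt", FakeDocument("beta\n"))])
container = FakeDocument(subfiles=[
    FakeSubfile("a.txt", FakeDocument("alpha\n")),
    FakeSubfile("inner.zip", inner),
    FakeSubfile("broken.bin", None),
])

out = io.StringIO()
filter_all(container, out)
print(out.getvalue(), end="")
```

With the real SDK, the stand-in classes would be dropped and filter_all would be called with an open container Document and a suitable output stream.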

Extract Subfiles

In some cases, you might need to access the subfiles directly, for example to archive the subfiles or process them using a different tool. In this case, you can open the container as a Document object, then open its subfiles as Document objects. After you open a subfile as a Document, you can call its methods to filter the data.

NOTE: Some options change the order in which subfiles are retrieved, such as enabling the root node. However, for each combination of options, the subfile order is consistent across multiple runs.