

Base FileIO classes for implementing reading and writing table files.

The FileIO abstraction covers only a subset of what a full filesystem implementation provides. Specifically, Iceberg needs to read or write a file at a given location (as a seekable stream) and to check whether a file exists. An implementation of the FileIO abstract base class is responsible for returning an InputFile instance, returning an OutputFile instance, and deleting a file given its location.


FileIO

Bases: ABC

A base class for FileIO implementations.

Source code in pyiceberg/io/
class FileIO(ABC):
    """A base class for FileIO implementations."""

    properties: Properties

    def __init__(self, properties: Properties = EMPTY_DICT):
        self.properties = properties

    @abstractmethod
    def new_input(self, location: str) -> InputFile:
        """Get an InputFile instance to read bytes from the file at the given location.

        Args:
            location (str): A URI or a path to a local file.
        """

    @abstractmethod
    def new_output(self, location: str) -> OutputFile:
        """Get an OutputFile instance to write bytes to the file at the given location.

        Args:
            location (str): A URI or a path to a local file.
        """

    @abstractmethod
    def delete(self, location: Union[str, InputFile, OutputFile]) -> None:
        """Delete the file at the given path.

        Args:
            location (Union[str, InputFile, OutputFile]): A URI or a path to a local file--if an InputFile instance or
                an OutputFile instance is provided, the location attribute for that instance is used as the URI to delete.

        Raises:
            PermissionError: If the file at location cannot be accessed due to a permission error.
            FileNotFoundError: When the file at the provided location does not exist.
        """
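
As a rough illustration of the contract above, the sketch below implements the three abstract methods against an in-memory dictionary. It does not import pyiceberg: `SketchFileIO`, `MemoryFileIO`, `MemoryInputFile`, and `MemoryOutputFile`, along with the helpers `read_all` and `write_all`, are hypothetical stand-ins that only mirror the method names shown here. Real implementations return InputFile and OutputFile subclasses backed by streams.

```python
from abc import ABC, abstractmethod
from typing import Dict, Union


class MemoryInputFile:
    """Hypothetical stand-in for an InputFile: reads bytes from a shared dict."""

    def __init__(self, location: str, store: Dict[str, bytes]):
        self.location = location
        self._store = store

    def exists(self) -> bool:
        return self.location in self._store

    def read_all(self) -> bytes:
        # Raise when the file is missing, mirroring the documented behaviour.
        if not self.exists():
            raise FileNotFoundError(self.location)
        return self._store[self.location]


class MemoryOutputFile:
    """Hypothetical stand-in for an OutputFile: writes bytes into a shared dict."""

    def __init__(self, location: str, store: Dict[str, bytes]):
        self.location = location
        self._store = store

    def write_all(self, data: bytes) -> None:
        self._store[self.location] = data


class SketchFileIO(ABC):
    """Simplified mirror of the FileIO interface (not the pyiceberg class)."""

    @abstractmethod
    def new_input(self, location: str) -> MemoryInputFile: ...

    @abstractmethod
    def new_output(self, location: str) -> MemoryOutputFile: ...

    @abstractmethod
    def delete(self, location: Union[str, MemoryInputFile, MemoryOutputFile]) -> None: ...


class MemoryFileIO(SketchFileIO):
    def __init__(self) -> None:
        self._store: Dict[str, bytes] = {}

    def new_input(self, location: str) -> MemoryInputFile:
        return MemoryInputFile(location, self._store)

    def new_output(self, location: str) -> MemoryOutputFile:
        return MemoryOutputFile(location, self._store)

    def delete(self, location: Union[str, MemoryInputFile, MemoryOutputFile]) -> None:
        # Accept either a plain string or a file object carrying a `location`.
        key = location if isinstance(location, str) else location.location
        if key not in self._store:
            raise FileNotFoundError(key)
        del self._store[key]


io_impl = MemoryFileIO()
io_impl.new_output("mem://table/metadata.json").write_all(b"{}")
source = io_impl.new_input("mem://table/metadata.json")
content = source.read_all()
io_impl.delete(source)  # deleting via the file object uses its location
still_there = source.exists()
```

The point of the sketch is the shape of the interface: callers only ever hand the FileIO a location (or a file object that carries one), and the implementation decides how that location maps to storage.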

delete(location) abstractmethod

Delete the file at the given path.


Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| location | Union[str, InputFile, OutputFile] | A URI or a path to a local file. If an InputFile or OutputFile instance is provided, its location attribute is used as the URI to delete. | required |

Raises:

| Type | Description |
|---|---|
| PermissionError | If the file at location cannot be accessed due to a permission error. |
| FileNotFoundError | When the file at the provided location does not exist. |

Source code in pyiceberg/io/
@abstractmethod
def delete(self, location: Union[str, InputFile, OutputFile]) -> None:
    """Delete the file at the given path.

    Args:
        location (Union[str, InputFile, OutputFile]): A URI or a path to a local file--if an InputFile instance or
            an OutputFile instance is provided, the location attribute for that instance is used as the URI to delete.

    Raises:
        PermissionError: If the file at location cannot be accessed due to a permission error.
        FileNotFoundError: When the file at the provided location does not exist.
    """

new_input(location) abstractmethod

Get an InputFile instance to read bytes from the file at the given location.


Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| location | str | A URI or a path to a local file. | required |

Source code in pyiceberg/io/
@abstractmethod
def new_input(self, location: str) -> InputFile:
    """Get an InputFile instance to read bytes from the file at the given location.

    Args:
        location (str): A URI or a path to a local file.
    """

new_output(location) abstractmethod

Get an OutputFile instance to write bytes to the file at the given location.


Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| location | str | A URI or a path to a local file. | required |

Source code in pyiceberg/io/
@abstractmethod
def new_output(self, location: str) -> OutputFile:
    """Get an OutputFile instance to write bytes to the file at the given location.

    Args:
        location (str): A URI or a path to a local file.
    """


InputFile

Bases: ABC

A base class for InputFile implementations.


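Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| location | str | A URI or a path to a local file. | required |

Attributes:

| Name | Type | Description |
|---|---|---|
| location | str | The URI or path to a local file for an InputFile instance. |
| exists | bool | Whether the file exists or not. |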

Source code in pyiceberg/io/
class InputFile(ABC):
    """A base class for InputFile implementations.

    Args:
        location (str): A URI or a path to a local file.

    Attributes:
        location (str): The URI or path to a local file for an InputFile instance.
        exists (bool): Whether the file exists or not.
    """

    def __init__(self, location: str):
        self._location = location

    @abstractmethod
    def __len__(self) -> int:
        """Return the total length of the file, in bytes."""

    @property
    def location(self) -> str:
        """The fully-qualified location of the input file."""
        return self._location

    @abstractmethod
    def exists(self) -> bool:
        """Check whether the location exists.

        Raises:
            PermissionError: If the file at self.location cannot be accessed due to a permission error.
        """

    @abstractmethod
    def open(self, seekable: bool = True) -> InputStream:
        """Return an object that matches the InputStream protocol.

        Args:
            seekable: Whether the stream should support seek, or is consumed sequentially.

        Returns:
            InputStream: An object that matches the InputStream protocol.

        Raises:
            PermissionError: If the file at self.location cannot be accessed due to a permission error.
            FileNotFoundError: If the file at self.location does not exist.
        """

location property

The fully-qualified location of the input file.

__len__() abstractmethod

Return the total length of the file, in bytes.

Source code in pyiceberg/io/
@abstractmethod
def __len__(self) -> int:
    """Return the total length of the file, in bytes."""

exists() abstractmethod

Check whether the location exists.


Raises:

| Type | Description |
|---|---|
| PermissionError | If the file at self.location cannot be accessed due to a permission error. |

Source code in pyiceberg/io/
@abstractmethod
def exists(self) -> bool:
    """Check whether the location exists.

    Raises:
        PermissionError: If the file at self.location cannot be accessed due to a permission error.
    """

open(seekable=True) abstractmethod

Return an object that matches the InputStream protocol.


Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| seekable | bool | Whether the stream should support seek, or is consumed sequentially. | True |

Returns:

| Name | Type | Description |
|---|---|---|
| InputStream | InputStream | An object that matches the InputStream protocol. |

Raises:

| Type | Description |
|---|---|
| PermissionError | If the file at self.location cannot be accessed due to a permission error. |
| FileNotFoundError | If the file at self.location does not exist. |

Source code in pyiceberg/io/
@abstractmethod
def open(self, seekable: bool = True) -> InputStream:
    """Return an object that matches the InputStream protocol.

    Args:
        seekable: Whether the stream should support seek, or is consumed sequentially.

    Returns:
        InputStream: An object that matches the InputStream protocol.

    Raises:
        PermissionError: If the file at self.location cannot be accessed due to a permission error.
        FileNotFoundError: If the file at self.location does not exist.
    """
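
To make the InputFile contract more concrete, here is a minimal sketch of a local-disk analogue. `LocalInputFileSketch` is a hypothetical class written for this example and does not subclass the pyiceberg base class. It relies on Python's built-in `open`, which already raises FileNotFoundError and PermissionError in the situations described above.

```python
import os
import tempfile
from typing import BinaryIO


class LocalInputFileSketch:
    """Hypothetical local-disk analogue of an InputFile (not a pyiceberg class)."""

    def __init__(self, location: str):
        self._location = location

    @property
    def location(self) -> str:
        return self._location

    def __len__(self) -> int:
        # Total size of the file in bytes.
        return os.path.getsize(self._location)

    def exists(self) -> bool:
        return os.path.exists(self._location)

    def open(self, seekable: bool = True) -> BinaryIO:
        # Built-in binary file objects already support seek, so the flag is
        # accepted for interface parity and otherwise ignored in this sketch.
        return open(self._location, "rb")


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.bin")
    with open(path, "wb") as f:
        f.write(b"0123456789")

    input_file = LocalInputFileSketch(path)
    size = len(input_file)
    with input_file.open() as stream:
        stream.seek(4)
        tail = stream.read(3)
        position = stream.tell()

    missing = LocalInputFileSketch(os.path.join(tmp, "absent.bin"))
    missing_exists = missing.exists()
```

Because `__len__` is part of the interface, a caller can ask for the file size with `len(input_file)` before deciding how much to read.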


InputStream

Bases: Protocol

A protocol for the file-like object returned by InputFile.open(...).

This outlines the minimally required methods for a seekable input stream returned from an InputFile implementation's open(...) method. These methods are a subset of IOBase/RawIOBase.
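
One way to see what this protocol asks for is to declare a structurally equivalent protocol and check a standard-library stream against it. In the sketch below, `SeekableInputSketch` is a hypothetical protocol defined only for this example; it lists the same method names as InputStream but is not the pyiceberg class.

```python
import io
from types import TracebackType
from typing import Optional, Protocol, Type, runtime_checkable


@runtime_checkable
class SeekableInputSketch(Protocol):
    """Hypothetical protocol listing the same methods as InputStream."""

    def read(self, size: int = 0) -> bytes: ...
    def seek(self, offset: int, whence: int = io.SEEK_SET) -> int: ...
    def tell(self) -> int: ...
    def close(self) -> None: ...
    def __enter__(self) -> "SeekableInputSketch": ...
    def __exit__(
        self,
        exctype: Optional[Type[BaseException]],
        excinst: Optional[BaseException],
        exctb: Optional[TracebackType],
    ) -> None: ...


# io.BytesIO is never declared as a subclass, yet it matches structurally.
buffer = io.BytesIO(b"iceberg")
matches_protocol = isinstance(buffer, SeekableInputSketch)

# A plain object lacks the methods, so the runtime check fails.
rejects_object = not isinstance(object(), SeekableInputSketch)

with buffer as stream:
    stream.seek(3)
    chunk = stream.read(4)
```

Note that a runtime `isinstance` check against a protocol only verifies that the named methods exist, not that their signatures or return types match.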

Source code in pyiceberg/io/
class InputStream(Protocol):
    """A protocol for the file-like object returned by `InputFile.open(...)`.

    This outlines the minimally required methods for a seekable input stream returned from an InputFile
    implementation's `open(...)` method. These methods are a subset of IOBase/RawIOBase.
    """

    def read(self, size: int = 0) -> bytes: ...

    def seek(self, offset: int, whence: int = SEEK_SET) -> int: ...

    def tell(self) -> int: ...

    def close(self) -> None: ...

    def __enter__(self) -> InputStream:
        """Provide setup when opening an InputStream using a 'with' statement."""

    @abstractmethod
    def __exit__(
        self, exctype: Optional[Type[BaseException]], excinst: Optional[BaseException], exctb: Optional[TracebackType]
    ) -> None:
        """Perform cleanup when exiting the scope of a 'with' statement."""


__enter__()

Provide setup when opening an InputStream using a 'with' statement.

Source code in pyiceberg/io/
def __enter__(self) -> InputStream:
    """Provide setup when opening an InputStream using a 'with' statement."""

__exit__(exctype, excinst, exctb) abstractmethod

Perform cleanup when exiting the scope of a 'with' statement.

Source code in pyiceberg/io/
@abstractmethod
def __exit__(
    self, exctype: Optional[Type[BaseException]], excinst: Optional[BaseException], exctb: Optional[TracebackType]
) -> None:
    """Perform cleanup when exiting the scope of a 'with' statement."""


OutputFile

Bases: ABC

A base class for OutputFile implementations.


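Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| location | str | A URI or a path to a local file. | required |

Attributes:

| Name | Type | Description |
|---|---|---|
| location | str | The URI or path to a local file for an OutputFile instance. |
| exists | bool | Whether the file exists or not. |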

Source code in pyiceberg/io/
class OutputFile(ABC):
    """A base class for OutputFile implementations.

    Args:
        location (str): A URI or a path to a local file.

    Attributes:
        location (str): The URI or path to a local file for an OutputFile instance.
        exists (bool): Whether the file exists or not.
    """

    def __init__(self, location: str):
        self._location = location

    @abstractmethod
    def __len__(self) -> int:
        """Return the total length of the file, in bytes."""

    @property
    def location(self) -> str:
        """The fully-qualified location of the output file."""
        return self._location

    @abstractmethod
    def exists(self) -> bool:
        """Check whether the location exists.

        Raises:
            PermissionError: If the file at self.location cannot be accessed due to a permission error.
        """

    @abstractmethod
    def to_input_file(self) -> InputFile:
        """Return an InputFile for the location of this output file."""

    @abstractmethod
    def create(self, overwrite: bool = False) -> OutputStream:
        """Return an object that matches the OutputStream protocol.

        Args:
            overwrite (bool): If the file already exists at `self.location`
                and `overwrite` is False, a FileExistsError should be raised.

        Returns:
            OutputStream: An object that matches the OutputStream protocol.

        Raises:
            PermissionError: If the file at self.location cannot be accessed due to a permission error.
            FileExistsError: If the file at self.location already exists and `overwrite=False`.
        """

location property

The fully-qualified location of the output file.

__len__() abstractmethod

Return the total length of the file, in bytes.

Source code in pyiceberg/io/
@abstractmethod
def __len__(self) -> int:
    """Return the total length of the file, in bytes."""

create(overwrite=False) abstractmethod

Return an object that matches the OutputStream protocol.


Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| overwrite | bool | If the file already exists at self.location and overwrite is False, a FileExistsError should be raised. | False |

Returns:

| Name | Type | Description |
|---|---|---|
| OutputStream | OutputStream | An object that matches the OutputStream protocol. |

Raises:

| Type | Description |
|---|---|
| PermissionError | If the file at self.location cannot be accessed due to a permission error. |
| FileExistsError | If the file at self.location already exists and overwrite=False. |

Source code in pyiceberg/io/
@abstractmethod
def create(self, overwrite: bool = False) -> OutputStream:
    """Return an object that matches the OutputStream protocol.

    Args:
        overwrite (bool): If the file already exists at `self.location`
            and `overwrite` is False, a FileExistsError should be raised.

    Returns:
        OutputStream: An object that matches the OutputStream protocol.

    Raises:
        PermissionError: If the file at self.location cannot be accessed due to a permission error.
        FileExistsError: If the file at self.location already exists and `overwrite=False`.
    """
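
The overwrite rule can be sketched for a local file with Python's exclusive-creation mode. `LocalOutputFileSketch` below is a hypothetical class for illustration, not a pyiceberg implementation. It assumes that mapping `overwrite=False` to mode `"xb"` is an acceptable way to get the FileExistsError behaviour described above.

```python
import os
import tempfile
from typing import BinaryIO


class LocalOutputFileSketch:
    """Hypothetical local-disk analogue of an OutputFile (not a pyiceberg class)."""

    def __init__(self, location: str):
        self._location = location

    @property
    def location(self) -> str:
        return self._location

    def exists(self) -> bool:
        return os.path.exists(self._location)

    def create(self, overwrite: bool = False) -> BinaryIO:
        # "xb" is exclusive creation: open() raises FileExistsError if the
        # path already exists. "wb" truncates and replaces any existing file.
        mode = "wb" if overwrite else "xb"
        return open(self._location, mode)


with tempfile.TemporaryDirectory() as tmp:
    output_file = LocalOutputFileSketch(os.path.join(tmp, "manifest.avro"))

    with output_file.create() as stream:
        first_write = stream.write(b"v1")

    try:
        output_file.create()  # second create without overwrite
        second_create_failed = False
    except FileExistsError:
        second_create_failed = True

    with output_file.create(overwrite=True) as stream:
        stream.write(b"version-2")

    with open(output_file.location, "rb") as f:
        final_content = f.read()
```

Using exclusive creation avoids a separate exists-then-open check, which could race with another writer creating the same file in between.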

exists() abstractmethod

Check whether the location exists.


Raises:

| Type | Description |
|---|---|
| PermissionError | If the file at self.location cannot be accessed due to a permission error. |

Source code in pyiceberg/io/
@abstractmethod
def exists(self) -> bool:
    """Check whether the location exists.

    Raises:
        PermissionError: If the file at self.location cannot be accessed due to a permission error.
    """

to_input_file() abstractmethod

Return an InputFile for the location of this output file.

Source code in pyiceberg/io/
@abstractmethod
def to_input_file(self) -> InputFile:
    """Return an InputFile for the location of this output file."""


OutputStream

Bases: Protocol

A protocol for the file-like object returned by OutputFile.create(...).

This outlines the minimally required methods for a writable output stream returned from an OutputFile implementation's create(...) method. These methods are a subset of IOBase/RawIOBase.
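
Since the protocol is structural, any object with these four methods can serve as an output stream. The sketch below shows a hypothetical `CountingOutputStream` that buffers writes in memory and tracks how many bytes were written. The class name, the `bytes_written` and `closed` attributes, and the `getvalue` helper are inventions for this example and are not part of pyiceberg.

```python
import io
from types import TracebackType
from typing import Optional, Type


class CountingOutputStream:
    """Hypothetical stream that provides the four OutputStream methods."""

    def __init__(self) -> None:
        self._buffer = io.BytesIO()
        self.bytes_written = 0
        self.closed = False

    def write(self, b: bytes) -> int:
        written = self._buffer.write(b)
        self.bytes_written += written
        return written

    def close(self) -> None:
        self.closed = True

    def getvalue(self) -> bytes:
        # Extra helper for inspection; not part of the protocol.
        return self._buffer.getvalue()

    def __enter__(self) -> "CountingOutputStream":
        return self

    def __exit__(
        self,
        exctype: Optional[Type[BaseException]],
        excinst: Optional[BaseException],
        exctb: Optional[TracebackType],
    ) -> None:
        # Always close, whether or not the block raised.
        self.close()


with CountingOutputStream() as out:
    out.write(b"abc")
    out.write(b"defg")

total = out.bytes_written
payload = out.getvalue()
was_closed = out.closed
```

Returning None from `__exit__` means exceptions raised inside the `with` block propagate to the caller after the stream is closed.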

Source code in pyiceberg/io/
class OutputStream(Protocol):  # pragma: no cover
    """A protocol for the file-like object returned by OutputFile.create(...).

    This outlines the minimally required methods for a writable output stream returned from an OutputFile
    implementation's `create(...)` method. These methods are a subset of IOBase/RawIOBase.
    """

    def write(self, b: bytes) -> int: ...

    def close(self) -> None: ...

    @abstractmethod
    def __enter__(self) -> OutputStream:
        """Provide setup when opening an OutputStream using a 'with' statement."""

    @abstractmethod
    def __exit__(
        self, exctype: Optional[Type[BaseException]], excinst: Optional[BaseException], exctb: Optional[TracebackType]
    ) -> None:
        """Perform cleanup when exiting the scope of a 'with' statement."""

__enter__() abstractmethod

Provide setup when opening an OutputStream using a 'with' statement.

Source code in pyiceberg/io/
@abstractmethod
def __enter__(self) -> OutputStream:
    """Provide setup when opening an OutputStream using a 'with' statement."""

__exit__(exctype, excinst, exctb) abstractmethod

Perform cleanup when exiting the scope of a 'with' statement.

Source code in pyiceberg/io/
@abstractmethod
def __exit__(
    self, exctype: Optional[Type[BaseException]], excinst: Optional[BaseException], exctb: Optional[TracebackType]
) -> None:
    """Perform cleanup when exiting the scope of a 'with' statement."""


_parse_location(location)

Return the scheme, netloc, and path parsed from the given location.

Source code in pyiceberg/io/
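def _parse_location(location: str) -> Tuple[str, str, str]:
    """Return the scheme, netloc, and path parsed from the given location."""
    uri = urlparse(location)
    if not uri.scheme:
        # No scheme: treat the location as a local path and make it absolute.
        return "file", uri.netloc, os.path.abspath(location)
    elif uri.scheme in ("hdfs", "viewfs"):
        return uri.scheme, uri.netloc, uri.path
    else:
        # Other schemes fold the netloc (for example, a bucket name) into the path.
        return uri.scheme, uri.netloc, f"{uri.netloc}{uri.path}"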
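
To show what the three branches produce, the sketch below repeats the same logic in a standalone function, with the last branch written as an explicit `else`, and runs it on a few sample locations. `parse_location_sketch` is a hypothetical name, and the bucket, host, and file names are made up for illustration.

```python
import os
from typing import Tuple
from urllib.parse import urlparse


def parse_location_sketch(location: str) -> Tuple[str, str, str]:
    """Return (scheme, netloc, path) using the branching described above."""
    uri = urlparse(location)
    if not uri.scheme:
        # No scheme: treat as a local path and make it absolute.
        return "file", uri.netloc, os.path.abspath(location)
    elif uri.scheme in ("hdfs", "viewfs"):
        # HDFS-style URIs keep host and path separate.
        return uri.scheme, uri.netloc, uri.path
    else:
        # Other schemes fold the netloc (e.g. a bucket name) into the path.
        return uri.scheme, uri.netloc, f"{uri.netloc}{uri.path}"


s3_parts = parse_location_sketch("s3://warehouse/db/table/data.parquet")
# -> ("s3", "warehouse", "warehouse/db/table/data.parquet")

hdfs_parts = parse_location_sketch("hdfs://namenode:8020/db/table/data.parquet")
# -> ("hdfs", "namenode:8020", "/db/table/data.parquet")

local_parts = parse_location_sketch("data/table.parquet")
# -> ("file", "", <absolute path to data/table.parquet>)
```

For object stores the returned path starts with the bucket name rather than a leading slash, while HDFS-style locations keep the path exactly as it appears in the URI.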