
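5. Input and Output

There are no specific programs or servers associated with the I/O subsystem, since it is used to interact with almost all servers in the GNU Hurd. It provides facilities for reading and writing I/O channels, which are the underlying implementation of file and socket descriptors in the GNU C library.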


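5.1 iohelp Library

The <hurd/iohelp.h> file declares several functions which are useful for low-level I/O implementations. Most Hurd servers do not call these functions directly, but they are used by several of the Hurd filesystem and networking helper libraries. libiohelp requires libthreads.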


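5.1.1 I/O Users

Most I/O servers need to implement some kind of user authentication checking. To facilitate that process, libiohelp has some functions which encapsulate a set of idvecs (FIXME: xref to C library) in a single struct iouser.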

Function: struct iouser * iohelp_create_iouser (struct idvec *uids, struct idvec *gids)

Create a new iouser for the specified uids and gids.

Function: struct iouser * iohelp_dup_iouser (struct iouser *iouser)

Return a copy of iouser.

Function: void iohelp_free_iouser (struct iouser *iouser)

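Release a reference to iouser.

I/O reauthentication is a rather complex protocol involving the authentication server as a trusted third party (see section Auth Protocol). To reduce the risk of flawed implementations, I/O reauthentication is encapsulated in the iohelp_reauth function.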

Function: struct iouser * iohelp_reauth (auth_t authserver, mach_port_t rend_port, mach_port_t newright, int permit_failure)

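Conduct a reauthentication transaction, and return a new iouser. authserver is the I/O server's auth port. The rendezvous port provided by the user is rend_port.

If the transaction cannot be completed, return zero, unless permit_failure is non-zero. If permit_failure is non-zero and the transaction fails, return an iouser that has no ids. The new port to be sent to the user is newright.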


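5.1.2 Conch Management

The conch is at the heart of the shared memory I/O subsystem. Several Hurd libraries implement shared I/O, and so libiohelp contains functions to facilitate conch management.

Everything about shared I/O is undocumented, because it is not needed for adequate performance and the RPC interface is simpler (see section I/O Interface). It is not useful for new libraries or servers to implement shared I/O.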


5.2 Pager Library

The external pager (XP) microkernel interface allows applications to provide backing store for a memory object, by converting hardware page faults into RPC requests. External pagers are required for memory-mapped I/O (see section Mapped Data) and stored filesystems (see section Stored Filesystems).

The external pager interface is quite complex, so the Hurd pager library contains functions which are aimed at creating multithreaded external pagers. libpager is declared in <hurd/pager.h>, and requires only the threads and ports libraries.


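5.2.1 Pager Management

The pager library defines the struct pager data type to represent a multithreaded pager. The general procedure for creating a pager is to define the functions listed in Pager Callbacks, allocate a libports bucket for the ports which will access the pager, and create at least one new struct pager with pager_create.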

Function: struct pager * pager_create (struct user_pager_info *u_pager, struct port_bucket *bucket, boolean_t may_cache, memory_object_copy_strategy_t copy_strategy)

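Create a new pager. The pager will have a port created for it (using libports, in bucket) and will be immediately ready to receive requests. u_pager will be provided to later calls to pager_find_address. The pager will have one user reference created. may_cache and copy_strategy are the original values of those attributes, as for memory_object_ready. Users may create references to pagers by use of the relevant ports library functions. On error, return null and set errno.

Once you are ready to turn over control to the pager library, you should call ports_manage_port_operations_multithread on the bucket, using pager_demuxer as the ports demuxer. This will handle all external pager RPCs, invoking your pager callbacks when necessary.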

Function: int pager_demuxer (mach_msg_header_t *inp, mach_msg_header_t *outp)

Demultiplex incoming libports messages on pager ports.

The following functions are the body of the pager library, and provide a clean interface to pager functionality:

Function: void pager_sync (struct pager *pager, int wait)
Function: void pager_sync_some (struct pager *pager, vm_address_t start, vm_size_t len, int wait)

Write data from pager pager to its backing store. Wait for all the writes to complete if and only if wait is set.

pager_sync writes all data; pager_sync_some writes only the data starting at start, for len bytes.

Function: void pager_flush (struct pager *pager, int wait)
Function: void pager_flush_some (struct pager *pager, vm_address_t start, vm_size_t len, int wait)

Flush data from the kernel for pager pager and force any pending delayed copies. Wait for all pages to be flushed if and only if wait is set.

pager_flush flushes all data; pager_flush_some flushes only the data starting at start, for len bytes.

Function: void pager_return (struct pager *pager, int wait)
Function: void pager_return_some (struct pager *pager, vm_address_t start, vm_size_t len, int wait)

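Flush data from the kernel for pager pager and force any pending delayed copies. Wait for all pages to be flushed if and only if wait is set. Have the kernel write back modifications.

pager_return flushes and restores all data; pager_return_some flushes and restores only the data starting at start, for len bytes.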

Function: void pager_offer_page (struct pager *pager, int precious, int writelock, vm_offset_t page, vm_address_t buf)

Offer a page of data to the kernel. If precious is set, then this page will be paged out at some future point; otherwise it might be dropped by the kernel. If the page is currently in core, the kernel might ignore this call.

Function: void pager_change_attributes (struct pager *pager, boolean_t may_cache, memory_object_copy_strategy_t copy_strategy, int wait)

Change the attributes of the memory object underlying pager pager. The may_cache and copy_strategy arguments are as for memory_object_change_attributes. Wait for the kernel to report completion if and only if wait is set.

Function: void pager_shutdown (struct pager *pager)

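Force termination of a pager. After this returns, no more paging requests on the pager will be honoured, and the pager will be deallocated. The actual deallocation might occur asynchronously if there are currently outstanding paging requests that will complete first.(6)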

Function: error_t pager_get_error (struct pager *p, vm_address_t addr)

Return the error code of the last page error for pager p at address addr.(7)

Function: error_t pager_memcpy (struct pager *pager, memory_object_t memobj, vm_offset_t offset, void *other, size_t *size, vm_prot_t prot)

Try to copy *size bytes between the region other points to and the region at offset in the pager indicated by pager and memobj. If prot is VM_PROT_READ, copying is from the pager to other; if prot contains VM_PROT_WRITE, copying is from other into the pager. *size is always filled in with the actual number of bytes successfully copied. Returns an error code if the pager-backed memory faults; if there is no fault, returns zero and *size will be unchanged.

These functions allow you to recover the internal struct pager state, in case the libpager interface doesn't provide an operation you need:

Function: struct user_pager_info * pager_get_upi (struct pager *p)

Return the struct user_pager_info associated with a pager.

Function: mach_port_t pager_get_port (struct pager *pager)

Return the port (receive right) for requests to the pager. It is absolutely necessary that a new send right be created from this receive right.


5.2.2 Pager Callbacks

Like several other Hurd libraries, libpager depends on you to implement application-specific callback functions. You must define the following functions:

Function: error_t pager_read_page (struct user_pager_info *pager, vm_offset_t page, vm_address_t *buf, int *write_lock)

For pager pager, read one page from offset page. Set *buf to be the address of the page, and set *write_lock if the page must be provided read-only. The only permissible error returns are EIO, EDQUOT, and ENOSPC.

Function: error_t pager_write_page (struct user_pager_info *pager, vm_offset_t page, vm_address_t buf)

For pager pager, synchronously write one page from buf to offset page. In addition, vm_deallocate (or equivalent) buf. The only permissible error returns are EIO, EDQUOT, and ENOSPC.

Function: error_t pager_unlock_page (struct user_pager_info *pager, vm_offset_t address)

A page should be made writable.

Function: error_t pager_report_extent (struct user_pager_info *pager, vm_address_t *offset, vm_size_t *size)

This function should report in *offset and *size the minimum valid address the pager will accept and the size of the object.

Function: void pager_clear_user_data (struct user_pager_info *pager)

This is called when a pager is being deallocated after all extant send rights have been destroyed.

Function: void pager_dropweak (struct user_pager_info *p)

This will be called when the ports library wants to drop weak references. The pager library creates no weak references itself, so if the user doesn't either, then it is all right for this function to do nothing.


5.3 I/O Interface

The I/O interface facilities are described in <hurd/io.defs>. This section discusses only RPC-based I/O operations.(8)


5.3.1 I/O Object Ports

The I/O server must associate each I/O port with a particular set of uids and gids, identifying the user who is responsible for operations on the port. Every port to an I/O server should also support either the file protocol (see section File Interface) or the socket protocol (see section Socket Interface); naked I/O ports are not allowed.

In addition, the server associates with each port a default file pointer, a set of open mode bits, a pid (called the "owner"), and some underlying object which can absorb data (for write) or provide data (for read).

The uid and gid sets associated with a port may not be visibly shared with other ports, nor may they ever change. The server must fix the identification of a set of uids and gids with a particular port at the moment of the port's creation. The other characteristics of an I/O port may be shared with other users. The I/O server interface does not generally specify the way in which servers may share these other characteristics (with the exception of the deprecated O_ASYNC interface); however, the file and socket interfaces make further requirements about what sharing is expected and what sharing is prohibited.

In general, users get send rights to I/O ports by some mechanism that is external to the I/O protocol. (For example, fileservers give out I/O ports in response to the dir_lookup and fsys_getroot calls. Socket servers give out ports in response to the socket_create and socket_accept calls.) However, the I/O protocol provides methods of obtaining new ports that refer to the same underlying object as another port. In response to all of these calls, all underlying state (including, but not limited to, the default file pointer, open mode bits, and underlying object) must be shared between the old and new ports. In the following descriptions of these calls, the term "identical" means this kind of sharing. All these calls must return send rights to a newly-constructed Mach port.

The io_duplicate call simply returns another port which is identical to an existing port and has the same uid and gid set.

The io_restrict_auth call returns another port, identical to the provided port, but which has a smaller associated uid and gid set. The uid and gid sets of the new port are the intersection of the set on the existing port and the lists of uids and gids provided in the call.
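The intersection rule can be illustrated with a small self-contained sketch. The function name restrict_ids and the plain-array representation are assumptions made for illustration only; a real server would more likely operate on idvec structures.

```c
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical helper, not part of any Hurd library: copy into OUT
   every id of HAVE that also appears in WANT, preserving the order of
   HAVE.  OUT must have room for NHAVE entries.  Returns the number of
   ids kept.  */
size_t
restrict_ids (const uid_t *have, size_t nhave,
              const uid_t *want, size_t nwant,
              uid_t *out)
{
  size_t kept = 0;

  for (size_t i = 0; i < nhave; i++)
    for (size_t j = 0; j < nwant; j++)
      if (have[i] == want[j])
        {
          out[kept++] = have[i];
          break;                /* One match is enough for this id.  */
        }

  return kept;
}
```

Because the result is built only from ids already on the existing port, the new port can never gain an id that the old port lacked.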

Users use the io_reauthenticate call when they wish to have an entirely new set of uids or gids associated with a port. In response to the io_reauthenticate call, the server must create a new port, and then make the call auth_server_authenticate to the auth server. The rendezvous port for the auth_server_authenticate call is the I/O port on which the io_reauthenticate call was made. The server provides the rend_int parameter to the auth server as a copy of the corresponding parameter in the io_reauthenticate call. The I/O server also gives the auth server a new port; this must be a newly created port identical to the old port. The auth server will return the set of uids and gids associated with the user, and guarantees that the new port will go directly to the user that possessed the associated authentication port. The server then identifies the new port given out with the specified IDs.


5.3.2 Simple Operations

Users write to I/O ports by calling the io_write RPC. They specify an offset parameter; if the object supports writing at arbitrary offsets, the server should honour this parameter. If -1 is passed as the offset, then the server should use the default file pointer. The server should return the amount of data which was successfully written. If the operation was interrupted after some but not all of the data was written, then it is considered to have succeeded and the server should return the amount written. If the port is not an I/O port at all, the server should reply with the error EOPNOTSUPP. If the port is an I/O port, but does not happen to support writing, then the correct error is EBADF.
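The write rules above can be modelled with an in-memory object. This is only a sketch under stated assumptions: struct sketch_port and sketch_write are hypothetical stand-ins rather than Hurd types or RPC stubs, the backing store has a fixed capacity, and an explicit offset is assumed to leave the default file pointer untouched.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the state a server keeps per I/O port; not a Hurd type.  */
struct sketch_port
{
  int is_io;            /* Does this port speak the I/O protocol at all?  */
  int writable;         /* Does the object support writing?  */
  size_t filepointer;   /* Default file pointer.  */
  size_t size;          /* Current object size.  */
  char data[64];        /* Fixed-capacity backing store.  */
};

/* Write LEN bytes at OFFSET, or at the default file pointer when OFFSET
   is -1.  Stores the number of bytes actually written in *AMOUNT.  */
int
sketch_write (struct sketch_port *p, const char *buf, size_t len,
              long offset, size_t *amount)
{
  if (!p->is_io)
    return EOPNOTSUPP;
  if (!p->writable)
    return EBADF;

  size_t start = offset == -1 ? p->filepointer : (size_t) offset;
  if (start > sizeof p->data)
    start = sizeof p->data;
  if (len > sizeof p->data - start)
    len = sizeof p->data - start;   /* Short write: report what fit.  */

  memcpy (p->data + start, buf, len);
  if (start + len > p->size)
    p->size = start + len;
  if (offset == -1)
    p->filepointer = start + len;   /* Assumption: explicit offsets
                                       leave the default pointer alone.  */

  *amount = len;
  return 0;
}
```

A partial write is reported as success with a smaller *amount, which mirrors the rule that an interrupted write counts as having succeeded.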

Users read from I/O ports by calling the io_read RPC. They specify the amount of data they wish to read and the offset. The offset has the same meaning as for io_write above. The server should return the data that was read. If the call is interrupted after some data has been read (and the operation is not idempotent) then the server should return the amount read, even if less than the amount requested. The server should return as much data as possible, but never more than requested by the user. If there is no data, but there might be later, the call should block until data becomes available. Indicate end-of-file conditions by returning zero bytes. If the call is interrupted after some data has been read, but the call is idempotent, then the server may return EINTR rather than actually filling the buffer (taking care that any modifications of the default file pointer have been reversed). Preferably, however, servers should return data.
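A minimal model of the read rules, again with hypothetical names (struct sketch_obj, sketch_read) that do not exist in the Hurd. Blocking and interruption are not modelled; the sketch only shows the "never more than requested" and "zero bytes means end-of-file" rules.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for a readable object; not a Hurd type.  */
struct sketch_obj
{
  const char *data;     /* Object contents.  */
  size_t size;          /* Current object size.  */
  size_t filepointer;   /* Default file pointer.  */
};

/* Read up to AMOUNT bytes into BUF from OFFSET, or from the default
   file pointer when OFFSET is -1.  Returns the number of bytes read;
   zero signals end-of-file.  */
size_t
sketch_read (struct sketch_obj *o, char *buf, size_t amount, long offset)
{
  size_t start = offset == -1 ? o->filepointer : (size_t) offset;

  if (start >= o->size)
    return 0;                       /* End-of-file: zero bytes.  */

  size_t avail = o->size - start;
  if (amount > avail)
    amount = avail;                 /* Never more than is present.  */

  memcpy (buf, o->data + start, amount);
  if (offset == -1)
    o->filepointer = start + amount;
  return amount;
}
```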

There are two categories of objects: seekable and non-seekable. Seekable objects must accept arbitrary offset parameters in the io_read and io_write calls, and must implement the io_seek call. Non-seekable objects must ignore the offset parameters to io_read and io_write, and should return ESPIPE to the io_seek call.

On seekable objects, io_seek changes the default file pointer for reads and writes. (See (libc)File Positioning section `File Positioning' in The GNU C Library Reference Manual, for the interpretation of the whence and offset arguments.) It returns the new offset as modified by io_seek.
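The seek behaviour can be sketched as follows. struct sketch_seekobj and sketch_seek are hypothetical names, and returning EINVAL for an unknown whence or a negative resulting offset is an assumption borrowed from POSIX lseek, not something this section specifies.

```c
#include <errno.h>
#include <stdio.h>      /* SEEK_SET, SEEK_CUR, SEEK_END */

/* Stand-in for an I/O object; not a Hurd type.  */
struct sketch_seekobj
{
  int seekable;         /* Nonzero for seekable objects.  */
  long filepointer;     /* Default file pointer.  */
  long size;            /* Current file size.  */
};

/* Move the default file pointer and store the new offset in *NEWP.  */
int
sketch_seek (struct sketch_seekobj *o, long offset, int whence, long *newp)
{
  long base;

  if (!o->seekable)
    return ESPIPE;      /* Non-seekable objects refuse to seek.  */

  switch (whence)
    {
    case SEEK_SET: base = 0; break;
    case SEEK_CUR: base = o->filepointer; break;
    case SEEK_END: base = o->size; break;
    default: return EINVAL;         /* Assumption, as for lseek.  */
    }

  if (base + offset < 0)
    return EINVAL;                  /* Assumption, as for lseek.  */

  o->filepointer = base + offset;
  *newp = o->filepointer;
  return 0;
}
```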

The io_readable interface returns the amount of data which can be immediately read. For the special technical meaning of "immediately", see Asynchronous I/O.


5.3.3 Open Modes

The server associates each port with a set of bits that affect its operation. The io_set_all_openmodes call modifies these bits and the io_get_openmodes call returns them. In addition, the io_set_some_openmodes and io_clear_some_openmodes do an atomic read/modify/write of the openmodes.
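One way a server might obtain the atomic read/modify/write behaviour is with C11 atomics. The sketch below is an illustration only: the sketch_* names are hypothetical, and an actual server may just as well guard the open mode word with a lock.

```c
#include <fcntl.h>      /* O_APPEND, O_NONBLOCK for callers */
#include <stdatomic.h>

/* Hypothetical per-port open mode word; not a Hurd type.  */
typedef atomic_int sketch_openmodes;

/* Replace all mode bits at once.  */
void
sketch_set_all (sketch_openmodes *m, int bits)
{
  atomic_store (m, bits);
}

/* Return the current mode bits.  */
int
sketch_get (sketch_openmodes *m)
{
  return atomic_load (m);
}

/* Atomically set BITS; returns the modes as they were before.  */
int
sketch_set_some (sketch_openmodes *m, int bits)
{
  return atomic_fetch_or (m, bits);
}

/* Atomically clear BITS; returns the modes as they were before.  */
int
sketch_clear_some (sketch_openmodes *m, int bits)
{
  return atomic_fetch_and (m, ~bits);
}
```

Because each update is a single atomic operation, two users who set and clear different bits concurrently cannot overwrite each other's change.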

The O_APPEND bit, when set, changes the behaviour of io_write when it uses the default file pointer on seekable objects. When io_write is done on a port with the O_APPEND bit set, it must set the file pointer to the current file size before doing the write (which would then increment the file pointer as usual). The current file size is the smallest offset which returns end-of-file when provided to io_read. The server must atomically bind this update to the actual data write with respect to other users of io_read, io_write, and io_seek.
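The requirement that the pointer update and the data write form one atomic step can be sketched with a per-object lock. The names struct sketch_file and sketch_append_write are hypothetical, the backing store is a fixed-size array, and a real server would take the same lock in its read and seek paths as well.

```c
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for a seekable object; not a Hurd type.  */
struct sketch_file
{
  pthread_mutex_t lock;   /* Serializes read, write and seek.  */
  int append;             /* Nonzero when O_APPEND is set.  */
  size_t filepointer;     /* Default file pointer, at most sizeof data.  */
  size_t size;            /* Current file size.  */
  char data[64];          /* Fixed-capacity backing store.  */
};

/* Write LEN bytes at the default file pointer.  Returns the number of
   bytes written.  */
size_t
sketch_append_write (struct sketch_file *f, const char *buf, size_t len)
{
  pthread_mutex_lock (&f->lock);

  if (f->append)
    f->filepointer = f->size;       /* Move to end-of-file first...  */

  if (len > sizeof f->data - f->filepointer)
    len = sizeof f->data - f->filepointer;
  memcpy (f->data + f->filepointer, buf, len);

  f->filepointer += len;            /* ...then advance as usual.  */
  if (f->filepointer > f->size)
    f->size = f->filepointer;

  pthread_mutex_unlock (&f->lock);
  return len;
}
```

Holding the lock across both the pointer update and the copy is what prevents another writer from slipping in between them and having its data overwritten.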

The O_FSYNC bit, when set, guarantees that io_write will not return until data is fully written to the underlying medium.

The O_NONBLOCK bit, when set, prevents read and write from blocking. They should copy such data as is immediately available. If no data is immediately available they should return EWOULDBLOCK.
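A minimal sketch of the non-blocking read path on a queue-like object. struct sketch_queue and sketch_read_nonblock are hypothetical names; end-of-file handling and the blocking path are left out.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for a non-seekable object holding buffered input.  */
struct sketch_queue
{
  char data[32];        /* Buffered bytes, oldest first.  */
  size_t avail;         /* Number of bytes immediately available.  */
};

/* Copy up to AMOUNT immediately available bytes into BUF and store the
   count in *GOT.  Returns EWOULDBLOCK when nothing is available.  */
int
sketch_read_nonblock (struct sketch_queue *q, char *buf, size_t amount,
                      size_t *got)
{
  if (q->avail == 0)
    return EWOULDBLOCK;             /* Nothing to copy right now.  */

  if (amount > q->avail)
    amount = q->avail;              /* Copy only what is available.  */

  memcpy (buf, q->data, amount);
  memmove (q->data, q->data + amount, q->avail - amount);
  q->avail -= amount;

  *got = amount;
  return 0;
}
```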

The definition of "immediately" is more-or-less server-dependent. Some servers, notably stored filesystem servers (see section Stored Filesystems), regard all data as immediately available. The one criterion is that something which must happen immediately may not wait for any user-synchronizable event.

The O_ASYNC bit is deprecated; its use is documented in the following section. This bit must be shared between all users of the same underlying object.


5.3.4 Asynchronous I/O

Users may wish to be notified when I/O can be done without blocking; they use the io_async call to indicate this to the server. In the io_async call the user provides a port to which the server should send sig_post messages as I/O becomes possible. The server must return a port which will be the reference port in the sig_post messages. Each io_async call should generate a new reference port. (FIXME: xref the C library manual for information on how to send sig_post messages.)

The server then sends one SIGIO signal to each registered async user every time I/O becomes possible. I/O is possible if at least one byte can be read or written immediately. The definition of "immediately" must be the same as for the implementation of the O_NONBLOCK flag (see section Open Modes). In addition, every time a user calls io_read or io_write on a non-seekable object, or at the default file pointer on a seekable object, another signal should be sent to each user if I/O is still possible.

Some objects may also define "urgent" conditions. Such servers should send the SIGURG signal to each registered async user anytime an urgent condition appears. After any RPC that has the possibility of clearing the urgent condition, the server should again send the signal to all registered users if the urgent condition is still present.

A more fine-grained mechanism for doing async I/O is the io_select call. The user specifies the kind of access desired, and a send-once right. If I/O of the kind the user desires is immediately possible, then the server should return so indicating, and destroy the send-once right. If I/O is not immediately possible, the server should save the send-once right, and send a select_done message as soon as I/O becomes immediately possible. Again, the definition of "immediately" must be the same for io_select, io_async, and O_NONBLOCK (see section Open Modes).
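The two io_select outcomes can be modelled with a callback standing in for the send-once right. Everything here is a hypothetical illustration: the sketch_* names are invented, only one waiter is tracked, and invoking the callback plays the role of sending the select_done message.

```c
#include <stddef.h>

/* Stand-in for a send-once right: a reply that may be used once.  */
typedef void (*sketch_reply_fn) (void *cookie);

/* Stand-in for an object that can be selected on; not a Hurd type.  */
struct sketch_selectable
{
  int ready;                 /* Is I/O immediately possible?  */
  sketch_reply_fn pending;   /* Saved reply, or NULL if none.  */
  void *cookie;              /* Argument for the saved reply.  */
};

/* Example reply: count how many times completion was signalled.  */
void
sketch_count_done (void *cookie)
{
  ++*(int *) cookie;
}

/* Returns 1 if I/O is immediately possible (the reply is dropped
   unused), or 0 after saving the reply for later.  */
int
sketch_select (struct sketch_selectable *s, sketch_reply_fn reply,
               void *cookie)
{
  if (s->ready)
    return 1;
  s->pending = reply;
  s->cookie = cookie;
  return 0;
}

/* Called by the server when I/O becomes immediately possible.  */
void
sketch_becomes_ready (struct sketch_selectable *s)
{
  s->ready = 1;
  if (s->pending)
    {
      sketch_reply_fn reply = s->pending;
      s->pending = NULL;     /* Send-once: never use it twice.  */
      reply (s->cookie);
    }
}
```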

For compatibility with 4.2 and 4.3 BSD, the I/O interface provides a deprecated feature (known as icky async I/O). The calls io_mod_owner and io_get_owner set and get the "owner" of the object, which is either a pid or a pgrp (if the value is negative). This implies that only one process at a time can do icky I/O on a given object. Whenever the I/O server is sending sig_post messages to all the io_async users, if the O_ASYNC bit is set, the server should also send a signal to the owning pid/pgrp. The ID port for this call should be different from all the io_async ID ports given to users. Users may find out what ID port the server uses for this by calling io_get_icky_async_id.
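The sign convention for the owner value can be made explicit with a small decoder. struct sketch_owner and sketch_decode_owner are hypothetical names used only for this illustration.

```c
#include <sys/types.h>

/* Decoded form of an "owner" value; not a Hurd type.  */
struct sketch_owner
{
  int is_pgrp;          /* Nonzero if the owner is a process group.  */
  pid_t id;             /* The pid or pgrp, always positive.  */
};

/* Interpret OWNER: a negative value names a process group, any other
   value names a single process.  */
struct sketch_owner
sketch_decode_owner (pid_t owner)
{
  struct sketch_owner o;

  o.is_pgrp = owner < 0;
  o.id = owner < 0 ? -owner : owner;
  return o;
}
```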


5.3.5 Information Queries

Users may call io_stat to find out information about the I/O object. Most of the fields of a struct stat are meaningful only for files. All objects, however, must support the fields st_fstype, st_fsid, st_ino, st_atime, st_atime_usec, st_mtime, st_mtime_usec, st_ctime, st_ctime_usec, and st_blksize.

st_fstype, st_fsid, and st_ino must be unique for the underlying object across the entire system.

st_atime and st_atime_usec hold the seconds and microseconds, respectively, of the system clock at the last time the object was read with io_read.

st_mtime and st_mtime_usec hold the seconds and microseconds, respectively, of the system clock at the last time the object was written with io_write.

Other appropriate operations may update the atime and the mtime as well; both the file and socket interfaces specify such operations.

st_ctime and st_ctime_usec hold the seconds and microseconds, respectively, of the system clock at the last time permanent meta-data associated with the object was changed. The exact operations which cause such an update are server-dependent, but must include the creation of the object.

The server is permitted to delay the actual update of these times until stat is called; before the server stores the times on permanent media (if it ever does so) it should update them if necessary.
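One possible reading of the delayed-update rule is to record only that a time needs updating, and to apply the clock value when stat is called or before the times are stored. The sketch below follows that reading; the sketch_* names are hypothetical and the scheme is an assumption, not a description of any particular server.

```c
#include <time.h>

/* Stand-in for per-object timestamps; not a Hurd type.  */
struct sketch_times
{
  time_t atime, mtime;          /* Values reported by stat.  */
  int atime_dirty, mtime_dirty; /* Pending updates not yet applied.  */
};

/* Cheap bookkeeping on the read and write paths.  */
void
sketch_note_read (struct sketch_times *t)
{
  t->atime_dirty = 1;
}

void
sketch_note_write (struct sketch_times *t)
{
  t->mtime_dirty = 1;
}

/* Apply pending updates; call this from the stat handler and before
   storing the times on permanent media.  */
void
sketch_flush_times (struct sketch_times *t, time_t now)
{
  if (t->atime_dirty)
    {
      t->atime = now;
      t->atime_dirty = 0;
    }
  if (t->mtime_dirty)
    {
      t->mtime = now;
      t->mtime_dirty = 0;
    }
}
```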

st_blksize gives the optimal I/O size in bytes for io_read and io_write; users should endeavor to read and write amounts which are multiples of the optimal size, and to use offsets which are multiples of the optimal size.
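A client that wants to follow this advice can round its offsets and lengths to multiples of st_blksize. The helper names below are hypothetical, and blksize is assumed to be non-zero.

```c
#include <stddef.h>

/* Largest multiple of BLKSIZE that is not greater than X.  */
size_t
sketch_round_down (size_t x, size_t blksize)
{
  return x - x % blksize;
}

/* Smallest multiple of BLKSIZE that is not less than X.  */
size_t
sketch_round_up (size_t x, size_t blksize)
{
  return sketch_round_down (x + blksize - 1, blksize);
}
```

For example, with an optimal size of 4096, a request covering bytes 5000 to 8999 could be widened to start at sketch_round_down (5000, 4096) and end at sketch_round_up (9000, 4096).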

In addition, objects which are seekable should set st_size to the current file size as in the description of the O_APPEND flag (see section Open Modes).

The st_uid and st_gid fields are unrelated to the "owner" as described above for icky async I/O.

Users may find out the version of the server they are talking to by calling io_server_version; this should return strings and integers describing the version number of the server, as well as its name.


5.3.6 Mapped Data

Servers may optionally implement the io_map call. The ports returned by io_map must implement the external pager kernel interface (see section Pager Library) and be suitable as arguments to vm_map.

Seekable objects must allow access from zero up to (but not including) the current file size as described for O_APPEND (see section Open Modes). Whether they provide access beyond such a point is server-dependent; in addition, the meaning of accessing a non-seekable object is server-dependent.



This document was generated by Akihiro Sagawa on June, 15 2005 using texi2html 1.70.