Name
Synopsis
Description
Programming Interface
IOV flags
Fundamentals
REQUEST-REPLY IPC
RECEIVE-SEND IPC
Administration
Warnings
Examples
Client code
RECEIVE-SEND IPC
Server code
REQUEST-REPLY IPC (IPC_SHARED)
Server code
Client code
See also
Name
Synopsis
Description
Programming Interface
User process
Aliases
Warnings
Examples
See also
Name
Synopsis
Description
Programming Interface
Administration
Warnings
Examples
See also
Name
Synopsis
Description
Functions
Example
See Also
Machine dependent kernel functions - i386 architecture
I/O Port access routines
I/O port management
Interrupt handling
Interrupt ISR management
IRQ auto detection
Timers and Time
Memory management code
DMA operation
PCI management
Name
Synopsis
Description and programming interface
Packet receiving and delivering
Packet queue macros
Examples
See also
Date: Sun Aug 12 14:58:03 GMT 2001
IPC: High performance Amoeba kernel interprocess communication module for the local
#include <sys/ipc.h>

typedef struct portid portid_t,*portid_p;
typedef struct ipc_message ipc_msg_t,*ipc_msg_p;

errstat ipc_register(portid_t *portid, ...
errstat ipc_unregister(portid_t *portid);
errstat ipc_lookup(portid_p portid, ...
errstat ipc_trans(ipc_msg_p msg_req, ...
errstat ipc_getreq(ipc_msg_p msg, ...
errstat ipc_putrep(ipc_msg_p msg, ...
errstat ipc_send(ipc_msg_p msg, ...
errstat ipc_recv(ipc_msg_p msg, ...
errstat ipc_create_shared(capability *segcap, ...
errstat ipc_map_shared(capability *origcap, ...
These functions will appear first in the new Vertex-Amoeba kernel, for fast and efficient communication between privileged system processes such as device drivers. They are definitely not intended to replace Amoeba's generic RPC interface. Most servers on top of the kernel will continue to use the generic RPC interface for their default interprocess communication, to ensure location independence for processes, one main goal of Amoeba. But there are cases where the RPC interface is too slow or not flexible enough and, for example for device drivers running in user space, the wrong choice. Device drivers are servers too, but they always run locally on a specific machine. The new IPC interface was introduced to support outsourcing of device drivers, protocol stacks and other less time critical servers. All servers inside the kernel and various new parts will support both interfaces, RPC and IPC. The IPC stub functions and data structures were specified to be very similar to the RPC ones. To make efficient transfer of complex data possible, data vector objects can be transferred, which gives a significant performance improvement: several buffers of different location and length can be transferred with one transaction or reply.
Data structures

struct ipc_message{
    portid_t portid; /* unique destination portid */
    ...
    long command; /* request command */
};

struct ipc_iovec{
    ...
};

IOV macros

IPC_IOVLEN(iov)
    Get the length of the iovec entry
IPC_IOVFLAGS(iov)
    Get the flags of the iovec entry
IPC_IOVBUF(iov)
    Get the buffer address of the iovec entry
IPC_SETIOVLEN(iov,len)
    Set the length field of the iovec entry
IPC_SETIOVFLAGS(iov,flags)
    Set the flags of the iovec field
IPC_SETIOV(iov,buf,len,flags)
    Set the buffer start address, the length and the flags of the iovec

IPC_COPY
    Copy data buffers between IPC partners
IPC_SHARED
    Translate shared data segment addresses between IPC partners
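The layout of struct ipc_iovec is not reproduced in this text, so the following is only a minimal sketch of how the IOV macros could be defined and used. The three field names (iov_buf, iov_len, iov_flags), the numeric flag values and the helper iov_total_len are assumptions made for illustration; the real definitions in sys/ipc.h may differ.

```c
#include <stddef.h>

/* ASSUMPTION: flag values are invented for this sketch. */
#define IPC_COPY   1
#define IPC_SHARED 2

/* ASSUMPTION: hypothetical layout; the real struct ipc_iovec is not shown. */
typedef struct ipc_iovec {
    void *iov_buf;    /* buffer start address */
    long  iov_len;    /* buffer length in bytes */
    int   iov_flags;  /* IPC_COPY or IPC_SHARED */
} ipc_iovec_t;

#define IPC_IOVLEN(iov)             ((iov).iov_len)
#define IPC_IOVFLAGS(iov)           ((iov).iov_flags)
#define IPC_IOVBUF(iov)             ((iov).iov_buf)
#define IPC_SETIOVLEN(iov,len)      ((iov).iov_len = (len))
#define IPC_SETIOVFLAGS(iov,flags)  ((iov).iov_flags = (flags))
#define IPC_SETIOV(iov,buf,len,flags) \
    do { \
        (iov).iov_buf = (buf); \
        (iov).iov_len = (len); \
        (iov).iov_flags = (flags); \
    } while (0)

/* Hypothetical helper: sum the lengths of all entries of an iovec array. */
long iov_total_len(const ipc_iovec_t *iov, int n)
{
    long total = 0;
    int i;

    for (i = 0; i < n; i++)
        total += IPC_IOVLEN(iov[i]);
    return total;
}
```

Wrapping IPC_SETIOV in do { } while (0) keeps the multi-statement macro safe inside an unbraced if; whether the original macro does the same is not known.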
1. All stub functions always return an error code, similar to the error conditions found in the stderr.h header. This error code shows the success of the IPC operation itself and is not the status given by the server. The server side status is returned in msg.status.
2. All request and reply buffers must currently be allocated by the user. For vector objects in the IPC_COPY case, each vector element consists of a user allocated memory buffer (stack, malloc, ...). On the request side the length specifies the maximal size of the buffer, and on the reply side the data length actually returned. In the IPC_SHARED case, both processes wanting to share virtual memory must create a shared segment. The client iovec virtual base address is given on the request side, and the kernel IPC module translates the virtual address from the client to the server side, and vice versa. The server gets a translated iovec array.
3. If reply data is present, the length field of the iovec structure contains the total size of the replied data in the buffer.
4. For each IPC port there is a private key capability for IPC port protection, given to the IPC module with ipc_register. Only processes knowing the public keyport capability can look up the IPC port; otherwise the lookup is denied by the IPC module.
5. Portids are protected with a randomly generated private field. This security field can only be obtained with ipc_lookup() (and the right key cap). Each time a new port is registered, the private field is randomly generated anew.
Each server thread of a process wanting to receive messages from a specific port must register the ipc port with:
errstat
ipc_register(portid,portname,namelen,keycap)
...
int namelen;
...

The ipc module returns in portid a unique port identification number related to the port name given in portname with length namelen. The keycap capability protects the port name against unauthorized usage. A client process must have the same key capability (not merely a restricted version of it) to look up this port. If the port was already registered, the same port id number is always returned.

A registered port can be unregistered either implicitly on server thread exit, or explicitly with:

errstat ipc_unregister(portid_t *portid);

A client can translate a port name into a port id number with:

errstat
ipc_lookup(portname,plen,portid,protcap)
char *portname;
...

The plen field contains the length of the portname string. The protection capability protcap must be the same as registered by the server (see above).
The message passing can take place in two ways:
1. bi-directional, in RPC request-reply semantics
   Client: ipc_trans
   Server: ipc_getreq, ipc_putrep
2. uni-directional, only from client to server
   Client: ipc_send
   Server: ipc_recv
A client thread can send a request-reply message with:

errstat ipc_trans(ipc_msg_p msg_req, ...

A message header consists of 2 parts:
1. portid
2. private data

Both the request message header reqmsg and the reply message header repmsg are always required, but they can (and for performance reasons should) be the same data structure. The request and reply data are packed in the io vector array iovec in the following way:

ipc_iovec_t iov[n];
IPC_SETIOV(iov[0],buf_1,buflen_1, [IPC_COPY] or [IPC_SHARED]);
...

The reqiovlen and repiovlen fields contain the length of the iovec array iov, in this case n. If there is no request and/or reply data, the corresponding length field must be zero.
On the server side, there are two functions for the request and the reply:

errstat
ipc_getreq(reqmsg,reqiov,reqlen)
ipc_msg_p reqmsg;
...

errstat
ipc_putrep(repmsg,repiov,replen)
ipc_msg_p repmsg;
ipc_iovec_p repiov;
...

Both functions must always be used to service a client request. The request and reply message and the iovec array can be the same data. To extract the buffer address, the length and, if needed, the flags from the iovec structure, there are several macros:

long len;
...
len = IPC_IOVLEN(iov[i]);
...
flags = IPC_IOVFLAGS(iov[i]);
The simpler case of sending and receiving an ipc message:

errstat ipc_send(ipc_msg_p msg, ...

errstat
ipc_recv(msg,iov,len)
...

The client sends a message with ipc_send, and the server waits for client messages with ipc_recv. The server must have previously registered its ipc port, or the receive will fail. The client must set the portid field in its msg structure to select the server ipc port. The server portid must be retrieved with ipc_lookup.

Shared Data segments

To make IPC transfers much faster, client and server processes can share data segments. First, the server process creates a shareable data segment with

errstat ipc_create_shared(capability *segcap, ...

This function allocates the data segment of size vsize, and returns the virtual start address in vaddr and the segment capability in segcap. The data segment is mapped into the server process address space with the MAP_SHARED flag.

errstat ipc_map_shared(capability *origcap, ...

The start address of the virtually mapped data segment is returned in vaddr. Usually, client processes send a request to the server to get the segment capability. All IPC transactions can use these shared data segments. In the iovec field, all processes must set the IPC_SHARED flag. However, each process uses its own virtual addresses for this shared data. The kernel IPC module is responsible for translating the virtual addresses between the processes during an IPC transaction.
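How the kernel IPC module performs this address translation is not described here. One plausible scheme, shown purely as an illustration, is to take the offset of the buffer within the sender's mapping and rebase it onto the receiver's mapping of the same segment. The type shared_map_t and the function translate_shared are hypothetical names for this sketch and are not part of the IPC interface.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one process' mapping of a shared segment. */
typedef struct shared_map {
    uintptr_t base;  /* virtual start address in this process */
    size_t    size;  /* segment size in bytes */
} shared_map_t;

/*
 * Rebase the buffer [addr, addr+len) from the sender's mapping onto the
 * receiver's mapping of the same segment.
 * Returns 0 and stores the translated address in *out on success,
 * -1 if the buffer does not lie completely inside both mappings.
 */
int translate_shared(const shared_map_t *from, const shared_map_t *to,
                     uintptr_t addr, size_t len, uintptr_t *out)
{
    size_t off;

    if (addr < from->base)
        return -1;
    off = (size_t)(addr - from->base);
    if (off > from->size || len > from->size - off)
        return -1;
    if (off > to->size || len > to->size - off)
        return -1;
    *out = to->base + off;
    return 0;
}
```

The bounds checks are written as subtractions (len > size - off) so that they cannot overflow for large addresses.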
On thread exit, the IPC module checks whether the dying thread was a registered ipc server thread. If so, the IPC module does the necessary cleanup and automatically unregisters the allocated ipc port slot. If this thread was the last one owning a registered port, the port name and id number vanish from the port name table.
Things that might unpleasantly surprise the user.
REQUEST-REPLY IPC (IPC_COPY)

A simple client-server example.

Server code

#include <amoeba.h>
#include <module/rnd.h>

#define MYSERVER_PORTNAME "myserver"

capability myportcap;

void
...
    errstat err;
    ...
    /* ... */
    rnd_getrandom(&myportcap,sizeof(capability));
    /* ... */
    name_append(SERVERPATH,&myportcap);
    /* ... */
    err = ipc_register(&portid,MYSERVER_PORTNAME,&myportcap);
    /* ... */
    for(;;)
    {
        /* ... */
        IPC_SETIOV(iov[0],buf,100,IPC_COPY);
        msg.portid=portid;
        err = ipc_getreq(&msg,iov,1);
        if(err != STD_OK)
            ...
        replen=0;
        switch(msg.command)
        ...
        case CMD_HELLO:
            msg.status = STD_OK;
            strcpy(buf,"I've got the data.\n");
            IPC_SETIOV(iov[0],buf, ...
            replen = 1;
            break;
        default:
            ...
        err = ipc_putrep(&msg,iov,replen);
    }
}
Client code

#include <amoeba.h>
#include <module/name.h>

#define MYSERVER_PORTNAME "myserver"
#define SERVERPATH "/tmp/myserver"
#define CMD_HELLO 100

capability myportcap;
portid_t portid;

void
...
    char buf[100];
    char *p;
    int len;
    ...
    /*
    ** Send a message to the server
    */
    ...
    name_lookup(SERVERPATH,&myportcap);
    err = ipc_lookup(&portid,MYSERVER_PORTNAME,&myportcap);
    if(err != STD_OK)
        ...
    strcpy(buf,"Hello, I'm there\n");
    /* ... */
    IPC_SETIOV(iov[0], ...
    msg.portid = portid;
    /* ... */
    err = ipc_trans(&msg,iov,1,&msg,iov,1);
    if(err != STD_OK || msg.status != STD_OK)
        ...
    /* ... */
    p = (char *) IPC_IOVBUF(iov[0]);
    if(len > 0)
        ...
}
This code sample was taken from the ipc test suite program.
myserver.h: |
#define PNAME "MYSERVER"

#include "myserver.h"
capability servercap;
void server()
portid_t /* |
pid; port protection capability. We are not allowed |
** to use the NULL capability!
*/ rnd_getrandom((char *)&servercap,sizeof(capability)); /* sprintf(path,"/tmp/%s",PNAME); name_delete(path); if(err!=STD_OK) /* str[0]=(char *)malloc(100); if(str[0]==NULL || str[1]==NULL ||
str[2]==NULL) /* err = ipc_register(&pid,PNAME,sizeof(PNAME),&servercap); |
if(err!=STD_OK)
{ |
|||
printf("ipc_server: can't register IPC port: %s\n", |
|||
} /* |
** The server loop */ for(n=0;n<100;n++) { |
/* *str[0]='\0'; IPC_SETIOV(req_iov[0],str[0],100,IPC_COPY); stra=(char *)malloc(300); /* req_msg.portid=pid; /* err = ipc_recv(&req_msg,req_iov,4); if(err!=STD_OK) /* if(req_msg.command!=
PCMD) /* |
**
Concatenate all strings, loop through the iovec list:
*/
*stra='\0';
for(i=0;i<3;i++)
{ |
||||
if(IPC_IOVLEN(req_iov[i])==0) |
||||
length\n", |
||||
i); |
||||
} |
if(strlen(stra)!=req_msg.size) |
|||
%d\n", |
|||
strlen(stra),req_msg.size); if(strlen(str4)!=req_msg.size) |
stra\n");
} |
exit(1); |
free(stra); |
|||||
} |
|||||
} Client code |
#include "myserver.h" void client() |
pid; |
|
ipc_msg_t req_msg; errstat err; /* |
reply buffers, the first from the stack (str1), |
** the rest from the heap str2=(char *)malloc(100); if(str2==NULL || str3==NULL)
printf("ipc_client: can't allocate buffers\n"); |
|||
} |
** Lookup the prot cap in the file system
*/
sprintf(path,"/tmp/%s",PNAME);
err = name_lookup(path,&servercap);
if(err!=STD_OK)
{ |
||
printf("ipc_client: can't lookup port prot cap %s:
%s\n", |
||
} /* |
**
Lookup the IPC port err = ipc_lookup(&pid,PNAME,sizeof(PNAME),&servercap); if(err!=STD_OK) |
printf("ipc_client: can't lookup IPC port: %s\n", |
|||
} /* |
** The client loop */ for(n=0;n<100;n++) { |
/* sprintf(str1,"This is part %d.",n*n*n); stra=(char *)malloc(300); sprintf(stra,"%s%s%s",str1,str2,str3); IPC_SETIOV(req_iov[0],str1,strlen(str1)+1,IPC_COPY); |
/* |
**
Setup the request message structure req_msg.portid=pid; /* err = ipc_send(&req_msg,req_iov,4); if(err!=STD_OK) free(stra); |
||||||
} |
||||||
} |
myserver.h: #define PNAME "Myserver" #include "myserver.h" void server() /* rnd_getrandom((char *)&servercap,sizeof(capability)); |
/* |
**
publish the cap in the file system sprintf(path,"/tmp/%s",PNAME); name_delete(path); if(err!=STD_OK) |
printf("ipc_server: can't append port prot cap %s:
%s\n", |
|||
} /* |
**
Register the IPC port err = ipc_register(&pid,PNAME,sizeof(PNAME),&servercap); if(err!=STD_OK) |
printf("ipc_server: can't register IPC port: %s\n", |
|||
} /* |
**
Create shareable data segment err = ipc_create_shared(&segcap,(long *)&segbuf,SEG_SIZE); if(err != STD_OK) |
printf("ipc_server: can't create shared segment:
%s\n", |
|||
} /* |
**
The server loop for(n=0;n<11;n++) |
rep_len = 0; /* bufa=(char *)malloc(300); IPC_SETIOV(req_iov[0], IPC_SETIOV(req_iov[1], |
/* |
0, |
**
Setup the request message structure req_msg.portid=pid; /* err = ipc_getreq(&req_msg,req_iov,2); if(err!=STD_OK) |
printf("ipc_server #5: ipc_getreq failed: %s\n", |
|||
} /* |
**
The right command ? if(req_msg.command ==
CMD_CAP) |
} |
IPC_SETIOV(rep_iov[0], rep_len=1; |
else if (req_msg.command == CMD_BUF)
{ |
||
/* if(IPC_IOVLEN(req_iov[1])!=req_msg.size) if(strcmp((char
*)IPC_IOVBUF(req_iov[1]),bufa)!=0) /* send the received shared data region back again */ IPC_SETIOV(rep_iov[0], rep_msg.size=strlen(bufa)+1; |
rep_len=1; |
} |
printf("ipc_server: invalid command (%d)\n", |
rep_msg.status=STD_OK; /* err = ipc_putrep(&rep_msg,rep_iov,rep_len); if(err!=STD_OK) free(bufa); |
|||
} err |
= ipc_unregister(&pid); |
if(err != STD_OK) |
|||
} |
#include "myserver.h" void client() char *segbuf; dummy=(char *)malloc(1000000); /* |
** Lookup the prot cap in the file system
*/
sprintf(path,"/tmp/%s",PNAME);
err = name_lookup(path,&servercap);
if(err!=STD_OK)
{ |
||
printf("ipc_client: can't lookup port prot cap %s:
%s\n", |
||
} /* |
**
Lookup the IPC port err = ipc_lookup(&pid,PNAME,sizeof(PNAME),&servercap); if(err!=STD_OK) |
printf("ipc_client: can't lookup IPC port: %s\n", |
|||
} /* |
**
Setup reply iovec (cap) IPC_SETIOV(rep_iov[0],&segcap,sizeof(capability),IPC_COPY); /* req_msg.portid=pid; err = ipc_trans(&req_msg,NULL,0,&rep_msg,rep_iov,1); if(err!=STD_OK) |
} |
printf("ipc_client: 1. ipc_trans failed: %s\n", |
if(rep_msg.status!=STD_OK)
{ |
|||
printf("ipc_client: 1. ipc_trans: server reply:
%s\n", |
|||
} |
if(IPC_IOVLEN(rep_iov[0])!=sizeof(capability))
{ |
|||
printf("ipc_client: server returned invalid cap\n"); |
|||
} |
bufsize=rep_msg.size;
/*
** Map in the shared segment
*/
err = ipc_map_shared(&segcap,(long *)&segbuf,bufsize);
if(err!=STD_OK)
{ |
||
printf("ipc_client: can't map shared data segment: %s\n",
||
} /* |
**
The client loop for(n=0;n<10;n++) |
/* offset within the shared data segment */ str=&segbuf[(n+1)*10]; sprintf(str,"Hallo %d %x \n",n*n*n,rand()); IPC_SETIOV(req_iov[0],str,strlen(str)+1,IPC_COPY); /* req_msg.portid=pid; /* err = ipc_trans(&req_msg,req_iov,2, if(err!=STD_OK) if(rep_msg.status!=STD_OK) /* if(IPC_IOVLEN(rep_iov[0])!=strlen(str)+1) |
printf("ipc_client #1: invalid reply [1] len: %d, expected %d\n",
|||||
} |
IPC_IOVLEN(rep_iov[0]), |
if(strcmp(str,(char
*)IPC_IOVBUF(rep_iov[0]))!=0) } free(dummy); } |
isr[K]
1. Kernel I/O port and management routines.
2. User process I/O port and mapping routines
Machine dependency: i386, ISA, PCI, VLB
#include <i386_proto.h>

void out_byte(int _port, int _val);
int in_byte(int _port);
void outs_byte(int _port, char * _ptr, int _bytecnt);

Kernel:
void request_region(unsigned int from, ...
int check_region(unsigned int from, unsigned int extent);
int release_region(unsigned int from, unsigned int extent);

User process:
errstat io_check_region(unsigned int start, ...
errstat io_map_region(unsigned int start, ...
errstat io_unmap_region(unsigned int start, ... capability *syscap);
errstat io_vtop(long vaddr, ...
Low level input and output routines for hardware port access. Different types are supported. See

I/O port access

GCC: all I/O port instructions are compiled inline.

Write a byte, word or long value _val to a hardware I/O port with address _port:

void out_byte(int _port, int _val);

Read a byte, word or long value from the hardware I/O port with address _port:

int in_byte(int _port);

There are some additional routines like the above, but with different I/O timing (pausing/short delay) for old buggy buses and interface cards:

out_##_p()

Output a string (memory area) of bytes, words, or longs, starting at address _ptr, of length _bytecnt (in byte units), _wordcnt (in word units) or _longcnt (in long units), to the I/O port with address _port:

void outs_byte(int _port, char * _ptr, int _bytecnt)

Read a memory area (bytes, words or longs) from the I/O port with address _port to the start address _ptr, in byte, word or long counts:

void ins_byte(int _port, char * _ptr, int _bytecnt)

I/O port management

Kernel
To avoid overlaps of used I/O ports, allocate a region in I/O space for a device driver with the request_region function:

void request_region(unsigned int from, unsigned int extent, ...

from determines the start address, extent determines the size of the region, and name is the (short) driver name.

Call check_region() before probing for your hardware. Once you have found your hardware, register it with request_region(). The return value of check_region() is < 0 if the port region is already allocated, else >= 0.

int check_region(unsigned int from, unsigned int extent)

If you unload/close the driver (?), or you don't need the port access anymore, use

int release_region(unsigned int from, unsigned int extent)
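The check/request/release cycle can be illustrated with a small user-space simulation of such a region registry. This is only a sketch: the sim_ prefixed names, the fixed-size table and the exact return values are assumptions made for illustration, and the real kernel routines may be implemented quite differently.

```c
#include <stddef.h>

#define SIM_MAX_REGIONS 16

/* One allocated I/O port range [from, from+extent). */
struct sim_region {
    unsigned int from;
    unsigned int extent;
    const char  *name;
};

static struct sim_region sim_table[SIM_MAX_REGIONS];
static int sim_count = 0;

/* Returns < 0 if the range overlaps an allocated region, else 0. */
int sim_check_region(unsigned int from, unsigned int extent)
{
    int i;

    for (i = 0; i < sim_count; i++) {
        const struct sim_region *r = &sim_table[i];
        if (from < r->from + r->extent && r->from < from + extent)
            return -1;
    }
    return 0;
}

/* Record a region; the request is silently dropped if the table is full. */
void sim_request_region(unsigned int from, unsigned int extent,
                        const char *name)
{
    if (sim_count < SIM_MAX_REGIONS) {
        sim_table[sim_count].from = from;
        sim_table[sim_count].extent = extent;
        sim_table[sim_count].name = name;
        sim_count++;
    }
}

/* Remove a region recorded with exactly these bounds: 0 if found, else -1. */
int sim_release_region(unsigned int from, unsigned int extent)
{
    int i;

    for (i = 0; i < sim_count; i++) {
        if (sim_table[i].from == from && sim_table[i].extent == extent) {
            sim_table[i] = sim_table[sim_count - 1];
            sim_count--;
            return 0;
        }
    }
    return -1;
}
```

A driver would call the check function first, probe its hardware only if the range is free, and then record the range so that later probes by other drivers see it as taken.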
#include <sys/iomap.h>

A user process can use the I/O port access functions if either the I/O ports are mapped into the process, or the I/O privilege level IOPL for the process is set to level 0.

To probe a port region first, there is a library stub function

errstat
io_check_region(start, size, syscap)
unsigned int start;
unsigned int size;
...

This function returns STD_EXISTS if the I/O port region in the range [start ... start+size] is already used by another module, or STD_OK if the port region is unused. The syscap capability must be the kernel host (root dir) capability, or the request will be denied.

To map in the port region, use

errstat
io_map_region(start,size,devname,syscap)
unsigned int start;
unsigned int size;
char *devname;
...

In addition to the arguments of io_check_region, a device name string must be supplied. To unmap a port region, use

errstat
io_unmap_region(start,size,syscap)

A process can get full I/O access by changing the IOPL:

errstat ...

Note: If the GNU gcc compiler is used, don't forget to enable optimization with the -O option, or the I/O port instructions cannot be compiled inline!

To get the physical address for a user process virtual address, use:

errstat io_vtop(long vaddr, ...

This function needs the virtual address vaddr and the length of the memory region vsize, and returns the translated physical address paddr. The memory region can be a mapped hardware segment, too.
For backward compatibility.

#define outb(a,b) out_byte(b,a)
#define outw(a,b) out_word(b,a)
Things that might unpleasantly surprise the user.
User Process I/O

#include <sys/iomap.h>
#define KERNELSYSCAP "/super/hosts/mymach"

int main()
...
    /* lookup the system capability */
    ...
    err=io_check_region(0x278,4,&syscap);
    if(err!=STD_OK)
        ...
    err = io_map_region(0x278,4,MYNAME,&syscap);
    out_byte(0x278,0xff);
    ...
    io_unmap_region(0x278,4,&syscap);
ipc[K], isr[K]
ISR - user device interrupt service routine handler management
#include <sys/isr.h>

typedef struct isr_handler isr_handler_t,*isr_handler_p;

errstat interrupt_register(isr_handler_p isr,capability *syscap);
errstat interrupt_await(int irq, event ev);

#define ISR_SERVER_PORT "sys::isr-server"
In addition to the new interprocess communication module IPC, the ISR module allows (privileged) user processes to service hardware device interrupts. To get this service, the process must send an ipc message via ipc_trans to the system isr server, registered with the portname sys::isr-server. The message request data holds the user isr handler structure with the desired isr settings. In the ipc_trans reply, the isr handler is returned with some additional settings, such as the isr id number. There are some library stub functions for this purpose.

After the ISR has been successfully registered, the user ISR must call the interrupt_await function. When the hardware device triggers this irq event, the kernel ISR scheduler wakes up the ISR handler. The isr handler can now service its irq. When finished, the isr handler must call the interrupt_done function to acknowledge the irq. The irq is kept locked until this function is called. Keep this in mind.

Interrupt sharing is possible. If several ISR handlers, perhaps in different processes, share one irq line, they are all woken up on the irq event.

The ISR server port is currently protected with the kernel root/host capability. The process can look up this host capability in the /super/hosts directory.
To register a user isr handler, use the interrupt_register library stub function:

errstat interrupt_register(isr_handler_p isr,capability *syscap);

The following fields in the isr handler must be set:

isr.irq = <myirq number>;
isr.flags = IRQ_NORMAL or IRQ_SHARED;

The system capability is the kernel host capability (see above).

errstat ...

The isr handler struct must be the same as returned by interrupt_register.

errstat interrupt_await(int irq, event ev);

Irq is the registered irq number, and ev is the event id number to wait for, returned by the register transaction in the isr.ev field. To acknowledge an interrupt, use

errstat ...

For example:

for(;;)
...
    ... service interrupt ...
    status = [IRQ_SERVICED] or [IRQ_UNKNOWN];
    err = interrupt_done(isr.irq,status);
}

Note: The interrupt service thread itself must register the ISR, and not any other thread of the process!
If a thread that previously registered an isr handler exits, the ISR handler will be automatically
Interrupt handlers, whether in kernel or user space, can corrupt the system. Your process needs full I/O privileges and can therefore corrupt your hardware. Furthermore, wrong arguments to, or wrong usage of, the interrupt_await and interrupt_done functions can cause trouble with the irq handling.
capability syscap;

void
...
    isr_handler_t isr;
    ...
    name_lookup(HOST_PATH,&syscap);
    isr.irq=3;
    ...
    err = interrupt_register(&isr,&syscap);
    if(err!=STD_OK)
        ...
    for(;;)
    ...
        .. service the irq ..
        stat = interrupt_done(isr.irq,IRQ_SERVICED);
    }
ipc[K], ioport[K]
kthread - kernel thread creation and thread memory management module
#include <sys/kthread.h>

int thread_newthread(func, stsize, param);
void thread_switch();
#ifdef KERNEL_GLOCALS
...
int thread_await(eventnum, timeoutval);
The thread module provides the programming interface to create, destroy and manage concurrent threads. Each thread can start executing a separate routine; they do not all have to execute the same function. All threads in a multi-threaded process share the same address space. Each thread has its own stack and program counter but otherwise shares the text, data and bss segments of the process. Because it is sometimes useful to have data global within a thread but not accessible outside the thread, glocal data is provided. See the description of thread_alloc below for details of how to allocate and use glocal data.

The threads are currently scheduled non-preemptively by default. Preemptive scheduling must be enabled explicitly using thread_enable_preemption (see thread_scheduling(L)). It is important to protect accesses to global data with mutexes (see mutex(L)). It is possible for a thread to request that it be rescheduled using threadswitch (see thread_scheduling(L)). This can be very useful in the presence of non-preemptive scheduling.

NB. If a program is multi-threaded it is not safe to use UNIX emulation routines in more than one thread of the program. UNIX is not multi-threaded and therefore the emulation is only likely to be correct if confined to a single thread of a program. For example, a program with several threads where one is in read waiting for input from a terminal and another does a fork may well hang until the read is satisfied. The exit routine has been modified to force a close on all descriptors, even if they are held by another thread, and no guarantees are made on the correctness of any resulting input or output if exit is called in such circumstances.
thread_newthread

int
thread_newthread(func, stsize, param)
void (*func)();
int stsize;
...
Thread_newthread spawns a new thread and starts it at function func. Thread_newthread allocates a thread stack of stsize bytes. Stsize must be at least 512 bytes or the calling program will be killed. Parameters can be passed to the new thread via the thread_newthread parameters param and psize. Param is a pointer to the data structure to pass, psize is the size of the data structure. Param must be allocated by a member of the malloc family (see malloc(L)) since the clean up when the thread exits will free this memory. Memory allocated using thread_alloc cannot be used!
When no parameters are passed, param must be a NULL-pointer and psize must be zero. Once the thread exits, the allocated stack and parameter area are freed.
The function func is called as follows:
void
(*func)(param)
long param;
If the called function returns, the thread exits.
Thread_newthread returns zero upon failure (insufficient memory or out of threads), otherwise the thread id number of the newly created thread is returned.

Note that not all threads are created equal. When a process first starts it consists of one thread which starts the routine main(). If main returns then exit() is called, which will terminate the entire process immediately. If it is desired that main terminate and the other threads continue then it must not return but call thread_exit (described below).

Typically, when a new thread is created the parent continues to execute until it blocks. However, if preemptive scheduling is enabled (see thread_scheduling(L)) then the newly created thread will have the same priority as the current thread. This means that at the next event (such as an interrupt) the new thread may be scheduled.
thread_exit
int
thread_exit()
Thread_exit stops the current thread. It then frees any glocal memory (allocated by thread_alloc/thread_realloc), the parameter area and the allocated stack before exiting. Thread_exit does not return. When the calling thread was a server thread, and it was still serving a client, the client will receive an RPC_FAILURE (see rpc(L)). If thread_exit is called in the last thread of a process, the process exits as well (see exitprocess(L)).
thread_switch

void
thread_switch()

Thread_switch calls the scheduler to perform a switch to another ready-to-run thread in the kernel or, if none is runnable, to a user thread.

thread_alloc

char *
thread_alloc(index, size)
int *index;
int size;
The first time thread_alloc is called (with *index == 0), thread_alloc allocates glocal data of size bytes and returns the module reference number in *index. The allocated data is initialized to zero. The value of the function is a pointer to the glocal data. Successive calls to thread_alloc with the previously assigned module reference number result in returning a pointer to the previously allocated memory.
Thread_alloc returns a NULL-pointer on insufficient memory or when a successive call to thread_alloc has a different size parameter than in the original call. In this case, the already allocated memory is not modified or freed.
Consider an example. Suppose a function in a single-threaded program uses a (static) global variable. For example,

static long sum;

long add(x)
long x;
{
    sum += x;
    return sum;
}
Now suppose this function must be used in a multi-threaded program, where the threads perform independent computations. This means a separate sum variable is required for each thread. If the number of threads is known in advance, the threads could be numbered and an array indexed by thread numbers could be maintained (assuming a thread can find out its own thread number, which is not trivial unless a parameter is added to add for this purpose). In general, however, this is too complex. A simpler solution is to use thread_alloc.

#include "thread.h"

static int ident;

long
add(x)
long x;
{
    long *p_sum;

    p_sum = (long *) thread_alloc(&ident, sizeof(long));
    *p_sum += x;
    return *p_sum;
}
Because there may be several functions in a program that need a block of glocal memory for private use, thread_alloc has an ident parameter that indicates the identity of the memory block (not the calling thread!). This must be the address of a global integer variable, statically initialized to zero. This variable is used for thread_alloc's internal administration; its contents must never be touched by the calling function. Functions that need to use the same block of glocal data must use the same ident variable. For consistency, all calls using the same ident variable must pass the same block size. To change the block size, thread_realloc can be used.
thread_await

int
thread_await(eventnum,timeoutval)
event eventnum;
...

Thread_await is used together with the thread_wakeup routine. A thread calling thread_await wants to wait for an event denoted by a unique event number eventnum, for example the address of any global variable of a process. Thread_await returns when either another thread triggers the event eventnum with thread_wakeup or a timeout occurs. On wakeup, thread_await returns 0, else -1 (timeout, interrupted). A zero timeout interval value indicates no timeout.

thread_wakeup

void
thread_wakeup(eventnum)
event eventnum;

Thread_wakeup wakes up all threads waiting for event eventnum.

Example:
event myevent;

int mythread_A()
...
    /* ... */
    res=thread_await((event) &myevent, (interval) 0);
    if(res<0)
        ...
}

int mythread_B()
...
    thread_wakeup((event) &myevent);
}

Warnings
The thread module uses the global variable _thread_local (see sys_newthread(L)) for its administration.
The following example shows thread creation and the use of glocal memory. Ref_nr is the module reference number. Each time the signal handler is activated, the glocal data is fetched which belongs to the running thread and this module. #define NTHREADS 5 void i
= *(int *) param; /*
Initially allocate memory and fetch module ref. number */ void |
} |
} *(int *)p = i; |
/* Do not wait for threads to exit */
thread_exit(); |
||
} |
rpc[L]
Machine dependent parts for i386
* I/O Port routines

Header-Files: #include <i386_proto.h>

Output byte, word or long data to a hardware port:
void out_byte(int _port, int _val)

Input byte, word or long data from a hardware port:
int in_byte(int _port)

Modified versions with small time delay (for noisy (ISA) buses):
out_##_p() with different I/O timing

Block transfer functions (string)

Output byte, word or long strings in byte count or in word/long counts (_l):
void outs_byte(int _port, char * _ptr, int _bytecnt)
void outs_long_l(int _port, char * _ptr, int _longcnt) -> count in longs !!!

And the byte, word or long input string versions:
void ins_byte(int _port, char * _ptr, int _bytecnt)

Aliases (Linux prototypes):
#define outb(a,b) out_byte(b,a)

If you use the GNU gcc compiler, the I/O port functions expand to asm inline macros, else for the case
To avoid overlaps of used I/O ports, allocate a region in I/O space for a device driver with the request_region method:
void request_region(unsigned int from,
from determines the start address, extent determines the size of the region and name is the (short) driver name. Call check_region() before probing for your hardware. Once you have found your hardware, register it with request_region(). The return value is < 0 if the port region is already allocated, else >= 0.
int check_region(unsigned int from,
If you unload the driver, use release_region to free ports (usually not necessary).
int release_region(unsigned int from,
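The check, probe, register sequence described above can be pictured with a small user-space model. This is only a sketch: the sim_ names, the fixed table size and the int return value of sim_request_region are assumptions made for illustration, not the kernel's interface (the kernel's request_region is declared void).

```c
#include <stddef.h>

/* Hypothetical user-space model of an I/O port region registry.
 * All sim_ names are illustrative and not part of the kernel. */
#define SIM_MAX_REGIONS 16

struct sim_region {
    unsigned int from;    /* first port of the region */
    unsigned int extent;  /* number of ports */
    const char *name;     /* short driver name */
    int used;             /* nonzero if this table entry is allocated */
};

static struct sim_region sim_regions[SIM_MAX_REGIONS];

/* Returns < 0 if [from, from+extent) overlaps an allocated region, else 0. */
int sim_check_region(unsigned int from, unsigned int extent)
{
    for (size_t i = 0; i < SIM_MAX_REGIONS; i++) {
        const struct sim_region *r = &sim_regions[i];
        if (r->used && from < r->from + r->extent && r->from < from + extent)
            return -1;
    }
    return 0;
}

/* Records the region. Returns 0 on success, -1 on overlap or full table. */
int sim_request_region(unsigned int from, unsigned int extent, const char *name)
{
    if (extent == 0 || sim_check_region(from, extent) < 0)
        return -1;
    for (size_t i = 0; i < SIM_MAX_REGIONS; i++) {
        struct sim_region *r = &sim_regions[i];
        if (!r->used) {
            r->from = from;
            r->extent = extent;
            r->name = name;
            r->used = 1;
            return 0;
        }
    }
    return -1;
}

/* Frees an exactly matching region. Returns 0 if found, else -1. */
int sim_release_region(unsigned int from, unsigned int extent)
{
    for (size_t i = 0; i < SIM_MAX_REGIONS; i++) {
        struct sim_region *r = &sim_regions[i];
        if (r->used && r->from == from && r->extent == extent) {
            r->used = 0;
            return 0;
        }
    }
    return -1;
}
```

A driver following the text would call the check function first, probe its hardware only if the range is free, and register the range once the hardware has been found.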
Header-Files:
#include <i386_proto.h>

General information for the case a hardware device triggers an interrupt:
* Hardware triggers IRQ ##
* ISR (Interrupt service routine) handler will be called with "Interrupts disabled" and "IRQ ##
* Never call functions within the ISR which are used by other kernel parts (most functions are
* If you need more kernel support, push a high level interrupt handler with enqueue(). After the
* It's allowed to enable interrupts again within the ISR (but pay attention): save_flags(flags);sti();

Useful functions:

Disable all hardware interrupts for a limited time:
void disable(void) (obsolete)
Enable hardware interrupts again:
void enable(void) (obsolete)
Get the processor flag register:
int get_flags(void)
And restore the processor flag register again:
void set_flags(int _flags)    /* macro */

Disabling and enabling of interrupts in critical code fragments should be done this way:
int flags;

Enabling and disabling of a specific interrupt level:
void enable_irq(unsigned int irq_nr)
The kernel keeps a registry of interrupt lines, similar to the registry of I/O ports. If a driver wants to install
int request_irq(unsigned int irq_nr,
With the argument list:
irq_nr:
handler:
Supported flags:
NULL, SA_NORMAL: Normal Interrupt
devname:
dev_id:

The return value is 0 for success, else negative for request failure (error code). It's not uncommon for the function to return -EBUSY to signal that another driver is already using the requested interrupt line. But interrupt lines can be shared with other devices, given the SA_SHIRQ flag. Device drivers can share an interrupt line only if they all call request_irq with the SA_SHIRQ flag!

Remove the irq handler from the irq registry queue with free_irq(). Additionally, if there is no other shared interrupt handler on this irq, this function disables the interrupt.
void free_irq(unsigned int irq, void *dev_id)
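The sharing rule above (a line can be shared only if every party asks for sharing) can be modelled in user space. This is a sketch under stated assumptions: the sim_ names, the table sizes, and the numeric values of SIM_SA_SHIRQ and SIM_EBUSY are invented for illustration, and the handler argument is left out because only the bookkeeping is shown.

```c
#define SIM_NR_IRQS   16   /* hypothetical number of interrupt lines */
#define SIM_MAX_SHARE 4    /* hypothetical limit of handlers per line */
#define SIM_SA_SHIRQ  0x1  /* stand-in for SA_SHIRQ; real value not given */
#define SIM_EBUSY     16   /* stand-in for EBUSY */

struct sim_irq_slot {
    void *dev_id;          /* identifies the registering device */
    unsigned long flags;   /* flags passed at registration */
    int used;              /* nonzero if the slot holds a handler */
};

static struct sim_irq_slot sim_irq[SIM_NR_IRQS][SIM_MAX_SHARE];

/* Register dev_id on irq. Returns 0, or -SIM_EBUSY if the line is taken
 * and not every party (existing and new) asked for sharing. */
int sim_request_irq(unsigned int irq, unsigned long flags, void *dev_id)
{
    int free_slot = -1;

    if (irq >= SIM_NR_IRQS)
        return -SIM_EBUSY;
    for (int i = 0; i < SIM_MAX_SHARE; i++) {
        struct sim_irq_slot *s = &sim_irq[irq][i];
        if (!s->used) {
            if (free_slot < 0)
                free_slot = i;
            continue;
        }
        if (!(flags & SIM_SA_SHIRQ) || !(s->flags & SIM_SA_SHIRQ))
            return -SIM_EBUSY;
    }
    if (free_slot < 0)
        return -SIM_EBUSY;
    sim_irq[irq][free_slot].dev_id = dev_id;
    sim_irq[irq][free_slot].flags = flags;
    sim_irq[irq][free_slot].used = 1;
    return 0;
}

/* Remove the handler identified by dev_id. Returns the number of handlers
 * still registered on the line; 0 means the line would now be disabled. */
int sim_free_irq(unsigned int irq, void *dev_id)
{
    int remaining = 0;

    if (irq >= SIM_NR_IRQS)
        return 0;
    for (int i = 0; i < SIM_MAX_SHARE; i++) {
        struct sim_irq_slot *s = &sim_irq[irq][i];
        if (s->used && s->dev_id == dev_id)
            s->used = 0;
        else if (s->used)
            remaining++;
    }
    return remaining;
}
```

The dev_id pointer is what lets the free routine pick the right handler when several drivers sit on the same line.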
The kernel offers a low-level facility for probing the interrupt number of a specific device. The facility consists of two functions:

unsigned long probe_irq_on(void)
This function returns a bitmask of unassigned interrupts. The driver must preserve the returned bitmask and pass it to probe_irq_off() later. After this call, the driver should arrange for its device to generate at least one interrupt.

int probe_irq_off(unsigned long)
After the device has requested an interrupt, the driver calls this function, passing as argument the bitmask previously returned by probe_irq_on(). probe_irq_off() returns the number of the interrupt that was issued after "probe_on". If no interrupt occurs, 0 is returned.
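The probing handshake can be illustrated with a user-space model. This is a sketch only: the sim_ functions, the 16-line limit and the way a device interrupt is simulated are assumptions. Returning 0 when more than one candidate line fired is also an assumption, since the text only specifies the result for zero or one interrupt.

```c
#define SIM_IRQ_LINES 16
#define SIM_IRQ_MASK  0xFFFFul      /* one bit per assumed interrupt line */

static unsigned long sim_assigned;  /* bit n set: IRQ n already has a handler */
static unsigned long sim_fired;     /* bit n set: IRQ n fired since probe start */

/* Test helper: mark a line as already owned by some driver. */
void sim_mark_assigned(unsigned int irq)
{
    if (irq < SIM_IRQ_LINES)
        sim_assigned |= 1ul << irq;
}

/* Test helper: pretend the hardware raised this interrupt line. */
void sim_raise_irq(unsigned int irq)
{
    if (irq < SIM_IRQ_LINES)
        sim_fired |= 1ul << irq;
}

/* Start probing: forget old events, return the mask of unassigned lines. */
unsigned long sim_probe_irq_on(void)
{
    sim_fired = 0;
    return ~sim_assigned & SIM_IRQ_MASK;
}

/* Finish probing: return the single unassigned line that fired, else 0. */
int sim_probe_irq_off(unsigned long mask)
{
    unsigned long hit = mask & sim_fired;

    for (int irq = 1; irq < SIM_IRQ_LINES; irq++)
        if (hit == (1ul << irq))
            return irq;
    return 0;
}
```

Between the two calls a real driver would program its device to raise an interrupt; here sim_raise_irq stands in for that step.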
Header-Files:
Install a new timer handler with sweeper_set():
void sweeper_set( sweep

arg       Argument given to timer handler
period    Timer period in milliseconds
once      If set, call the timer handler only one time. If zero, the timer handler is called each period.
Additional Linux implementation using sweeper_set():
Initialize the timer struct (not really needed) with init_timer(), add a new timer (always called only once) with add_timer(), and remove the timer with del_timer() (not really needed):
void init_timer(struct timer_list *timer)
Header file:
#include <linux/timer.h>
struct timer_list{
To get the current time in milliseconds (resolution 1 ms) from the hardware timer use:
unsigned long getmilli()
There is another, global time variable:
unsigned long milli_uptime;
It delivers the current time in milliseconds, but with time resolution PIT_INTERVAL (pit.h), usually 10 ms. It is possible that this time value is not updated quickly enough for accurate time measurement (delays up to 10 ms).
Hardware devices:
Header files:
Map in the kernel page table the hardware device pages from on-board memory in the range between 640K and 1024K from address `addr' (physical address) with total size `size' (bytes):
void page_mapin(phys_bytes addr, phys_bytes size)
Map (mainly PCI) hardware device pages (configuration space, on-board memory...) from physical address 'addr' with total size 'size' to the virtual address 'virt' with
vir_bytes ioremap(addr,size)
Header files:
Set up a DMA channel:
int dma_setup(
mode */
int channel, /* DMA channel */
*/
int count) /* number of bytes to move */
Mode register values:
DMA_CHANMASK channel select mask
Start a DMA request on a given channel:
void dma_start(int channel)
Stop DMA transfer by disabling DMA channel for all further DMA operations:
void dma_done(int channel)
Header files:
The following functions should be used by a PCI driver to look for its hardware device:
int pcibios_present(void)
int pcibios_find_device(
int pcibios_find_class(
char *pcibios_strerror(int error)

Accessing the configuration space:
After the driver has detected the device, it usually needs to read from or write to the three address spaces: memory, port and configuration. In particular, accessing the configuration space is vital to the driver because it's the only way it can find out where the device is mapped in memory and in the I/O space.
int pcibios_read_config_byte(
int pcibios_read_config_word(
int pcibios_read_config_dword(
unsigned char where,
unsigned int *ptr)
int pcibios_write_config_byte(
unsigned char bus,
unsigned char function,
unsigned char where,
unsigned char val)
int pcibios_write_config_word(
unsigned char bus,
unsigned char function,
unsigned char where,
unsigned short val)
int pcibios_write_config_dword(
unsigned char bus,

For a more detailed explanation of the PCI stuff check out popular Linux documentation.
Network driver and ethernet interface
#include <amoeba.h>
#include <ethif.h>
#include "pktqueue.h"
This manual page describes the network driver functions and management within the Amoeba kernel.
Initialization of network drivers
From protocols/generic/ethif.c the function eth_init is called. This function initializes all ethernet devices found in the system. eth_init scans through all entries in the struct array heilist [interface 0, interface 1, ..] found in the file etherconf.c, placed in the main configuration path for the specific kernel (the location where the main Amakefile for the kernel and other configuration files also reside, e.g. conf/i80386.ack/kernel/ibm_at/workstation).
heilist[] is of type hei_t, defined in ethif.h. The first entry contains the name of the interface (e.g. "lance", or "NS8390", which stands for many card types using the 8390 chip, ...), the second gives the number of different cards supported by this driver, and the rest point to driver functions important for higher network layers:
struct hard_ethernet_interface_info
{
char *hei_name;
                       Allocate local device struct(s)
int (*hei_init)();     Probe and init the card
}
typedef struct hard_ethernet_interface_info hei_t,*hei_p ;
eth_init first calls hei_alloc to allocate and initialize the driver-specific device struct, and then the probing and initializing routine hei_init. If the probing was successful, hei_init returns 1, otherwise 0 or a negative value.
Commonly the driver init routine
hei_init(int hardifno, int softifno, char *ifaddr,
allocates and initializes the following buffers and pools:
* Allocate data buffers for the receive packet pool:
rdata=(char *) alloc((vir_bytes)(RCVPOOLSIZE*RCVETHPKTSIZE),0);
* Initialize receive packet pool:
pkt_init(&rpool,RCVETHPKTSIZE,bufs,
* RX-Queue management; necessary in almost all drivers: Within the ISR the driver collects the packets from the card into preallocated packet buffers from the receive pool. But the card driver delivers the packets to the FLIP-Box outside the ISR, in a higher level function enqueued by the scheduler.
int rxhead,rxtail;
* Fill the RX-Queue with packets from the receive pool:
for(i=0;i<RQUEUESIZE;i++)
* TX-Queue management:
If the driver can't immediately send a packet delivered with hei_send, it queues the packet and sends it later upon a transmitter interrupt.
int txhead,txtail;
pkt_p *txpkts;
txhead=txtail=0;
txpkts=(pkt_p *)
Additionally, the init routine copies the ethernet address of the card into the char buffer ifaddr, sets up irq handling, fills the local device structure and much more.
Non-DMA cards receive and store ethernet packets in their own on-board receiver ring buffer. Afterwards, the card triggers the interrupt and the ISR is called. Now the driver pulls the next RX-queue pointer for the new packet:
int rnext;
if((rnext=rxhead+1)==RQUEUESIZE)
rnext=0;
if(rnext==rxtail)
error(NOFREEPKTSLOT);
else
pkt_p;
}
The local getpkt() function stores the packet to the buffer address
buf=pkt_offset(pkt);
Additionally, a high level receiver routine will be enqueued. This routine delivers the collected packets in the RX-Queue to the FLIP-Box:
int rhead;
pkt_p pktin,pktout;
pktout=rxpkts[rxtail];
/* Get a new packet from the receive pool */
PKT_GET(pktin,&rpool);
pktin->p_admin.pa_size=RCVETHPKTSIZE;
proto_setup_input(pktin,eh_t);
pktin->p_contents.pc_totsize=
}
error(OUFOFPKT);
}
/* put the new packet in the RX-Queue */
rxpkts[rxtail]=pktin;
/* deliver the extracted packet */
eth_arrived(softifno,pktout);
Packet sending
The higher network layer (FLIP-Box) calls the
hei_send ( int ifno , pkt_p pkt )
function. If the driver can send the packet immediately (the transmitter is not busy), it must take care of the packet buffer structure. The packet data consists of two parts:
1. The administration data at address pkt_offset(pkt) of length pkt->p_contents.pc_dirsize
2. The actual data somewhere in the (virtual) user or kernel address space starting at address
pkt->p_contents.pc_virtual
and of length
pkt->p_contents.pc_totsize - pkt->p_contents.pc_dirsize
The local putpkt() routine must send these two areas separately. Sometimes it gets into trouble with unaligned (word, long...) data. There are two possibilities:
* The packet contains only direct data, no virtual: increase the byte count to the alignment boundary
* The packet contains both direct and virtual data: Pain. Copy the packet in a
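For the direct-data-only case, increasing the byte count to the alignment boundary is a plain rounding step. The helper below is a minimal sketch; the name align_up and the restriction to power-of-two alignments are assumptions, not taken from any driver source.

```c
#include <assert.h>
#include <stddef.h>

/* Round count up to the next multiple of align.
 * align must be a power of two (2 for word, 4 for long transfers). */
size_t align_up(size_t count, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (count + align - 1) & ~(align - 1);
}
```

A 13 byte packet sent with long (4 byte) transfers would be padded to 16 bytes this way, while a count that is already aligned stays unchanged.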
If the transmitter is still busy, you must queue the packet in the TX-Queue:
int tnext,
txpkts[txhead]=pkt;
}
txhead=tnext;
When the transmitter is ready to send a new packet, it raises an interrupt. Inside the ISR we enqueue a high level transmitting routine. This routine puts the next packet from the TX-Queue (FIFO order) into the card's tx-buffer:
int tnext;
pkt_p pkt;
pkt=txpkts[txtail];
putpkt(pkt);
pkt_discard(pkt);
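The head/tail handling of the TX-Queue is that of a circular FIFO. The following user-space sketch shows both sides, queueing when the transmitter is busy and fetching on a transmitter interrupt. The sim_ names, the queue size and the use of void * in place of pkt_p are assumptions for illustration. One slot is always left empty so that a full queue can be told apart from an empty one.

```c
#include <stddef.h>

#define SIM_TQUEUESIZE 4          /* hypothetical queue size */

typedef void *sim_pkt_p;          /* stand-in for the kernel's pkt_p */

static sim_pkt_p sim_txpkts[SIM_TQUEUESIZE];
static int sim_txhead, sim_txtail;

/* Queue a packet at the head. Returns 0, or -1 if the queue is full. */
int sim_tx_put(sim_pkt_p pkt)
{
    int tnext = sim_txhead + 1;

    if (tnext == SIM_TQUEUESIZE)
        tnext = 0;                /* wrap around */
    if (tnext == sim_txtail)
        return -1;                /* no free slot */
    sim_txpkts[sim_txhead] = pkt;
    sim_txhead = tnext;
    return 0;
}

/* Fetch the oldest packet from the tail. Returns NULL if the queue is empty. */
sim_pkt_p sim_tx_get(void)
{
    sim_pkt_p pkt;

    if (sim_txtail == sim_txhead)
        return NULL;
    pkt = sim_txpkts[sim_txtail];
    if (++sim_txtail == SIM_TQUEUESIZE)
        sim_txtail = 0;           /* wrap around */
    return pkt;
}
```

With a queue size of 4, three packets can be pending at once; the fourth put fails until a packet has been fetched.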
Because all drivers currently operate in the above manner, it makes sense to use macros for the receive and send packet pools. These macros are defined in the header pktqueue.h in the net driver directory.
These macros make the following assumption about the private device driver structure dev:
struct dev
{
char *name;
...
Further the definitions:
RXQUEUESIZE RCVPOOLSIZE/2 (??)
The device structure must contain these variables. The available macros are:
Check for packets in the receive or send queue:
CHECKTXQUEUE(dev)
CHECKRXQUEUE(dev)
If a packet can't be sent immediately (transmitter busy), put it in the TX queue:
PUTTXQUEUE(dev,pkt)
A packet was received. Get the next pkt pointer from the RXQUEUE. Within the ISR the driver can't allocate a packet struct from the RCVPOOL. So it uses a preallocated packet from the RXQUEUE.
PUTRXQUEUE(dev,pkt)
Get a packet from the TX queue:
GETTXQUEUE(dev,pkt)
Get a packet from the RX queue and allocate a new one from the receive pool:
GETRXQUEUE(dev,pkt)
Take a look in the source code of various network drivers in the directory
isr[K]
ioport[K]
ipc[K]
kthread[K]
Document collected and written by: Stefan Bosse