Now anyone can schedule events and get a callback to work as long
as the user data structure that is added for the event begins
with a kore_event data structure.
All event state is now kept in that kore_event structure, and
CONN_[READ|WRITE]_POSSIBLE has been renamed to KORE_EVENT_[READ|WRITE].
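The "begins with a kore_event" requirement can be illustrated with a
minimal sketch. All demo_* names below are hypothetical stand-ins and the
layout of the real kore_event structure may differ. The point is only that
a pointer to the user structure is also a valid pointer to its first
member, so an event loop can dispatch without knowing the outer type:

```c
#include <stddef.h>

/* Hypothetical stand-in for kore_event; the real layout may differ. */
struct demo_event {
    int     type;
    int     flags;
    void    (*handle)(void *, int);
};

/* User structure: the event header has to be the first member. */
struct demo_timer {
    struct demo_event   evt;
    int                 fired;
};

/* The callback gets back the same pointer that was registered. */
static void
demo_timer_handle(void *arg, int error)
{
    struct demo_timer   *t = arg;

    if (!error)
        t->fired++;
}

/* What an event loop could do with the opaque pointer it was given. */
static void
demo_dispatch(void *udata, int error)
{
    struct demo_event   *evt = udata;

    evt->handle(udata, error);
}
```

Placing the header anywhere but first would break the cast in
demo_dispatch(), which is why the ordering matters.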
A filemap is a way of telling Kore to serve files from a directory
much like a traditional webserver can do.
Kore filemaps only handle files. Kore does not generate directory
indexes or deal with non-regular files.
The way files are sent to a client differs a bit per platform and
build options:
default:
- mmap() backed file transfer due to TLS.
NOTLS=1
- sendfile() under FreeBSD, macOS and Linux.
- mmap() backed file for OpenBSD.
The opened file descriptors/mmap'd regions are cached and reused when
appropriate. If a file is no longer in use it will be closed and evicted
from the cache after 30 seconds.
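The eviction rule above could be expressed roughly as follows. This is a
sketch under assumptions: the demo_fileref structure, its members and the
millisecond clock are invented for illustration and are not Kore's actual
kore_fileref layout.

```c
#include <stdint.h>

#define DEMO_EVICT_AFTER_MS     30000

/* Hypothetical cache entry, not the real kore_fileref. */
struct demo_fileref {
    int         refs;       /* current users of this entry */
    uint64_t    last_used;  /* time of the last release, in ms */
};

/* Returns 1 if the entry is unused and has been idle for 30 seconds. */
static int
demo_fileref_expired(const struct demo_fileref *ref, uint64_t now)
{
    if (ref->refs > 0 || now < ref->last_used)
        return (0);

    return (now - ref->last_used >= DEMO_EVICT_AFTER_MS);
}
```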
New APIs are available, allowing developers to use these facilities via:
void net_send_fileref(struct connection *, struct kore_fileref *);
void http_response_fileref(struct http_request *, struct kore_fileref *);
Kore will attempt to match media types based on file extensions. A few
default types are built-in. Others can be added via the new "http_media_type"
configuration directive.
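Matching a media type on a file extension can be sketched as a simple
table lookup. The table contents and the demo_* names are assumptions made
for this example; they are not the built-in list Kore ships with, and the
comparison here is case-sensitive for brevity.

```c
#include <string.h>

struct demo_media_type {
    const char  *ext;
    const char  *type;
};

/* Hypothetical table; Kore's built-in defaults may differ. */
static const struct demo_media_type demo_types[] = {
    { "html", "text/html" },
    { "css", "text/css" },
    { "js", "application/javascript" },
    { "png", "image/png" },
    { NULL, NULL }
};

/* Returns the media type for the extension of path, or NULL if unknown. */
static const char *
demo_media_type_find(const char *path)
{
    const char  *dot;
    int         i;

    if ((dot = strrchr(path, '.')) == NULL)
        return (NULL);

    for (i = 0; demo_types[i].ext != NULL; i++) {
        if (strcmp(dot + 1, demo_types[i].ext) == 0)
            return (demo_types[i].type);
    }

    return (NULL);
}
```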
The HTTP layer used to make a copy of each incoming header and its
value for a request. Stop doing that and make HTTP headers zero-copy
all across the board.
This change comes with some API function changes, notably the
http_request_header() function which now takes a const char ** rather
than a char ** out pointer.
This commit also constifies several members of http_request, beware.
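What the const char ** out pointer means for callers is shown in the
sketch below. demo_header_get() is a hypothetical function written for
this example, not http_request_header() itself. The value handed back
points into storage owned by someone else, so the caller may read it but
must not modify or free it.

```c
#include <string.h>

struct demo_header {
    const char  *name;
    const char  *value;
};

/*
 * Hypothetical lookup. Returns 1 and points *out at the stored value
 * (no copy is made), or returns 0 if the header is absent.
 * Real header names compare case-insensitively; strcmp() keeps the
 * sketch short.
 */
static int
demo_header_get(const struct demo_header *hdrs, size_t n,
    const char *name, const char **out)
{
    size_t  i;

    for (i = 0; i < n; i++) {
        if (strcmp(hdrs[i].name, name) == 0) {
            *out = hdrs[i].value;
            return (1);
        }
    }

    return (0);
}
```

Code that previously declared the out variable as char * would need to
switch to const char * to compile against a signature like this.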
Additionally, rework how the worker processes deal with the accept lock.
Before:
if a worker held the accept lock and it accepted a new connection
it would release the lock for others and back off for 500ms before
attempting to grab the lock again.
This approach worked, but under high load the delay becomes noticeable.
Now:
- workers not holding the accept lock and not having any connections
will wait less long before returning from kore_platform_event_wait().
- workers not holding the accept lock will no longer blindly wait
an arbitrary amount in kore_platform_event_wait() but will look
at how long until the next lock grab is and base their timeout
on that.
- if a worker's next_lock timeout is up and it failed to grab the
lock, it will try again in half the time.
- the worker process holding the lock will, when releasing the lock,
double check if it still has space for newer connections. If it does,
it will keep the lock until it is full. This prevents the lock from
bouncing between several non-busy worker processes all the time.
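Two of the rules above, basing the wait timeout on the next lock attempt
and halving the retry delay after a failed grab, can be sketched like
this. The function names and the millisecond units are assumptions for
illustration; this is not the code in the worker itself.

```c
#include <stdint.h>

/*
 * Timeout to hand to the event wait for a worker without the lock:
 * the time remaining until its next lock attempt, or 0 if that
 * moment has already passed.
 */
static uint64_t
demo_wait_timeout(uint64_t now, uint64_t next_lock)
{
    return (next_lock > now ? next_lock - now : 0);
}

/*
 * Schedule the next lock attempt. After a failed grab the worker
 * retries in half the usual delay.
 */
static uint64_t
demo_next_lock(uint64_t now, uint64_t delay_ms, int got_lock)
{
    if (!got_lock)
        delay_ms /= 2;

    return (now + delay_ms);
}
```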
Additional fixes:
- Reduce the number of times we check the timeout list, only do it twice
per second rather than every event tick.
- Fix solo worker count for TLS (we actually hold two processes, not one).
- Make sure we don't accidentally miscalculate the idle time causing new
connections under heavy load to instantly drop.
- Swap from gettimeofday() to clock_gettime() now that macOS caught up.
- Change pools to use mmap() for allocating regions.
- Change kore_malloc() to use pools for commonly sized objects.
(split into power-of-2 buckets, from 8 bytes up to 8192).
- Rename kore_mem_free() to kore_free().
The preallocated pools will hold up to 128K of elements per block size.
In case a larger object is to be allocated kore_malloc() will use
malloc() instead.
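Picking a bucket for a requested size could look roughly like the sketch
below. It assumes the buckets are the powers of two from 8 to 8192 bytes;
the demo_* names are hypothetical and the real kore_malloc() may compute
the index differently.

```c
#include <stddef.h>

#define DEMO_BLOCK_MIN      8
#define DEMO_BLOCK_MAX      8192

/*
 * Returns the index of the smallest bucket that fits len
 * (0 for 8 bytes, 1 for 16 bytes, ... 10 for 8192 bytes),
 * or -1 when the request is too large and should go to malloc().
 */
static int
demo_bucket_index(size_t len)
{
    size_t  size;
    int     idx;

    if (len > DEMO_BLOCK_MAX)
        return (-1);

    idx = 0;
    for (size = DEMO_BLOCK_MIN; size < len; size <<= 1)
        idx++;

    return (idx);
}
```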
Setting the handle callback allows your application
to take care of network events for the connection.
Look at the connection state and flags to determine
if read/write is possible and go from there.
See kore_connection_handle() for more details.
This configuration option limits the maximum number
of connections a worker process can accept() in a single
event loop.
It can be used to more evenly spread out incoming connections
across workers when new connections arrive in a burst.
In cases where accept() failed Kore would not relinquish the
lock towards other worker processes.
This becomes evident when dealing with a high number of concurrent
connections to the point the fd table gets full. In this scenario
the worker with the full fd table will spin attempting to accept
newer connections.
As a bonus, Kore now allows exactly up to worker_max_connections
connections per worker before no longer attempting to grab the
accept lock.
Introduces two new configuration knobs:
* socket_backlog (backlog for listen(2))
* http_request_limit
The second one is the more interesting of the two.
Before, Kore would iterate over all received HTTP requests
in its queue before returning out of http_process().
Under heavy load, Kore can spend a considerable amount of time
iterating over said queue. With http_request_limit, Kore will
process at most http_request_limit requests before returning
back to the event loop.
This means responses to processed requests are sent out much quicker
and allows Kore to handle any other incoming requests more gracefully.
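The effect of such a limit can be sketched with a bounded loop over a
queue. The demo_request list and demo_process() are made up for this
example and do not mirror http_process().

```c
#include <stddef.h>

struct demo_request {
    struct demo_request     *next;
    int                     done;
};

/*
 * Process at most limit queued requests, then return the head of
 * whatever is left so the caller can get back to its event loop.
 */
static struct demo_request *
demo_process(struct demo_request *head, unsigned int limit,
    unsigned int *processed)
{
    unsigned int    count = 0;

    while (head != NULL && count < limit) {
        head->done = 1;     /* stand-in for real request handling */
        head = head->next;
        count++;
    }

    *processed = count;
    return (head);
}
```

Requests left in the queue are picked up on the next pass, after the
event loop has had a chance to flush responses and read new input.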
- The net code no longer has a recv_queue; instead it reuses the same recv buffer.
- Introduce net_recv_reset() to reset the recv buffer when needed.
- Have the workers spread the load better between them by slightly
delaying their next accept lock and giving them an accept threshold
so they don't go ahead and keep accepting connections if they end
up winning the race constantly between the workers.
- The kore_worker_acceptlock_release() is no longer available.
- Prepopulate the HTTP server response header that is added to each
response in both normal HTTP and SPDY modes.
- The path and host members of http_request are now allocated on the heap.
These changes overall result in better performance on a multicore machine;
the worker load changes especially shine through.
Kore no longer passes the accept lock to the "next in line"
worker but instead all workers will attempt to grab the lock
if they can.
Also remember if we had the lock in the previous iteration of the
event loop and don't constantly disable/enable the accepting sockets.
Makes Kore scale even better across multiple CPUs.
Tasks are now assigned to available threads instead
of a global task list.
You can now pass messages between your page handler
and the created task using the kore_task_channel_*
functions.
Only one task at a time can be assigned to a request
but I feel this is probably a bad design choice.
Preferably we'd want to be able to start tasks
regardless of being in a page handler or not,
this not only adds flexibility but seems like
a better choice overall as it opens a lot more
possibilities for how tasks can be used.
Has support for full async pgsql queries. Most of the logic
is hidden behind a KORE_PGSQL() macro allowing you to insert
these pgsql calls in your page handlers without blocking the
kore worker while the query is going off.
There is room for improvement here, and perhaps KORE_PGSQL won't
stay as I feel this might overcomplicate things instead of making
them simpler as I thought it would.
- Introduce own memory management system on top of malloc to keep track
of all our allocations and frees. Later we should introduce a pooling
mechanism for fixed size allocations (http_request comes to mind).
- Introduce ssl_cipher in configuration.
Memory usage is kind of high right now, but it seems it's OpenSSL
doing it rather than Kore.
Instead of waiting until one worker is filled up on connections
the workers find the next lowest loaded worker and will hand
over the lock to them instead. This will cause a nicer spread of load.
Instead of running one accept per event loop, we attempt to accept
as many as worker_max_connections allows.
Refactor net sending/recv code a bit.
new connections and which ones will not be notified for it.
Fixes the thundering herd problem, and nicely spreads out load between
all the workers equally. A configuration option (worker_max_connections)
is available to tweak how many connections a worker will have before
giving up the accept lock.
This commit adds two ways of doing access locking:
- Locking via semaphores.
- Locking via GCC's builtin atomic methods.
The default is running with semaphores disabled (OpenBSD cannot do
sem_init() with pshared set to 1, which is required).
If you want to use semaphores add KORE_USE_SEMAPHORES to CFLAGS,
and -lpthread to LDFLAGS in the Makefile.
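Locking with GCC's builtin atomics can be sketched with a
compare-and-swap on a single lock word. The demo_lock structure and
function names are hypothetical. In a multi-process setup the lock word
would have to live in memory shared between the workers, which this
sketch does not set up.

```c
/* Hypothetical lock word; would need to sit in shared memory. */
struct demo_lock {
    volatile int    held;
};

/* Atomically flip held from 0 to 1. Returns nonzero on success. */
static int
demo_trylock(struct demo_lock *l)
{
    return (__sync_bool_compare_and_swap(&l->held, 0, 1));
}

/* Atomically flip held back to 0; does nothing if it was not held. */
static void
demo_unlock(struct demo_lock *l)
{
    (void)__sync_bool_compare_and_swap(&l->held, 1, 0);
}
```

A failed demo_trylock() returns immediately instead of blocking, which
lets a worker carry on with its existing connections and retry later.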
Other fixes:
- BSD: add a timeout to kevent().
- Merge kore_worker_wait together, Linux knows waitpid() as well.
- Send the correct SIGQUIT signal to workers instead of SIGINT.
- Fix kore_time_ms().
- Log fatal worker messages in syslog.
- Refactor code even more.
- Do not free our own kore_worker structure.