Adding upstream version 2.4.2+debian.
Signed-off-by: Daniel Baumann <daniel@debian.org>
parent 0fae05cfb7
commit 153471ed4b
64 changed files with 9668 additions and 0 deletions

src/resperf.1.in (new file, 647 lines)
@@ -0,0 +1,647 @@
.\" Copyright 2019-2021 OARC, Inc.
|
||||
.\" Copyright 2017-2018 Akamai Technologies
|
||||
.\" Copyright 2006-2016 Nominum, Inc.
|
||||
.\" All rights reserved.
|
||||
.\"
|
||||
.\" Licensed under the Apache License, Version 2.0 (the "License");
|
||||
.\" you may not use this file except in compliance with the License.
|
||||
.\" You may obtain a copy of the License at
|
||||
.\"
|
||||
.\" http://www.apache.org/licenses/LICENSE-2.0
|
||||
.\"
|
||||
.\" Unless required by applicable law or agreed to in writing, software
|
||||
.\" distributed under the License is distributed on an "AS IS" BASIS,
|
||||
.\" WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
.\" See the License for the specific language governing permissions and
|
||||
.\" limitations under the License.
|
||||
.TH resperf 1 "@PACKAGE_VERSION@" "resperf"
|
||||
.SH NAME
|
||||
resperf \- test the resolution performance of a caching DNS server
|
||||
.SH SYNOPSIS
|
||||
.hy 0
|
||||
.ad l
|
||||
\fBresperf\-report\fR\ [\fB\-a\ \fIlocal_addr\fB\fR]
|
||||
[\fB\-d\ \fIdatafile\fB\fR]
|
||||
[\fB\-M\ \fImode\fB\fR]
|
||||
[\fB\-s\ \fIserver_addr\fB\fR]
|
||||
[\fB\-p\ \fIport\fB\fR]
|
||||
[\fB\-x\ \fIlocal_port\fB\fR]
|
||||
[\fB\-t\ \fItimeout\fB\fR]
|
||||
[\fB\-b\ \fIbufsize\fB\fR]
|
||||
[\fB\-f\ \fIfamily\fB\fR]
|
||||
[\fB\-e\fR]
|
||||
[\fB\-D\fR]
|
||||
[\fB\-y\ \fI[alg:]name:secret\fB\fR]
|
||||
[\fB\-h\fR]
|
||||
[\fB\-i\ \fIinterval\fB\fR]
|
||||
[\fB\-m\ \fImax_qps\fB\fR]
|
||||
[\fB\-r\ \fIrampup_time\fB\fR]
|
||||
[\fB\-c\ \fIconstant_traffic_time\fB\fR]
|
||||
[\fB\-L\ \fImax_loss\fB\fR]
|
||||
[\fB\-C\ \fIclients\fB\fR]
|
||||
[\fB\-q\ \fImax_outstanding\fB\fR]
|
||||
[\fB\-v\fR]
|
||||
.ad
|
||||
.hy
|
||||
.hy 0
|
||||
.ad l
|
||||
|
||||
\fBresperf\fR\ [\fB\-a\ \fIlocal_addr\fB\fR]
|
||||
[\fB\-d\ \fIdatafile\fB\fR]
|
||||
[\fB\-M\ \fImode\fB\fR]
|
||||
[\fB\-s\ \fIserver_addr\fB\fR]
|
||||
[\fB\-p\ \fIport\fB\fR]
|
||||
[\fB\-x\ \fIlocal_port\fB\fR]
|
||||
[\fB\-t\ \fItimeout\fB\fR]
|
||||
[\fB\-b\ \fIbufsize\fB\fR]
|
||||
[\fB\-f\ \fIfamily\fB\fR]
|
||||
[\fB\-e\fR]
|
||||
[\fB\-D\fR]
|
||||
[\fB\-y\ \fI[alg:]name:secret\fB\fR]
|
||||
[\fB\-h\fR]
|
||||
[\fB\-i\ \fIinterval\fB\fR]
|
||||
[\fB\-m\ \fImax_qps\fB\fR]
|
||||
[\fB\-P\ \fIplot_data_file\fB\fR]
|
||||
[\fB\-r\ \fIrampup_time\fB\fR]
|
||||
[\fB\-c\ \fIconstant_traffic_time\fB\fR]
|
||||
[\fB\-L\ \fImax_loss\fB\fR]
|
||||
[\fB\-C\ \fIclients\fB\fR]
|
||||
[\fB\-q\ \fImax_outstanding\fB\fR]
|
||||
[\fB\-v\fR]
|
||||
.ad
|
||||
.hy
|
||||
.SH DESCRIPTION
|
||||
\fBresperf\fR is a companion tool to \fBdnsperf\fR. \fBdnsperf\fR was
|
||||
primarily designed for benchmarking authoritative servers, and it does not
|
||||
work well with caching servers that are talking to the live Internet. One
|
||||
reason for this is that dnsperf uses a "self-pacing" approach, which is
|
||||
based on the assumption that you can keep the server 100% busy simply by
|
||||
sending it a small burst of back-to-back queries to fill up network buffers,
|
||||
and then send a new query whenever you get a response back. This approach
|
||||
works well for authoritative servers that process queries in order and one
|
||||
at a time; it also works pretty well for a caching server in a closed
|
||||
laboratory environment talking to a simulated Internet that's all on the
|
||||
same LAN. Unfortunately, it does not work well with a caching server talking
|
||||
to the actual Internet, which may need to work on thousands of queries in
|
||||
parallel to achieve its maximum throughput. There have been numerous
|
||||
attempts to use dnsperf (or its predecessor, queryperf) for benchmarking
|
||||
live caching servers, usually with poor results. Therefore, a separate tool
|
||||
designed specifically for caching servers is needed.
|
||||
.SS "How resperf works"
|
||||
Unlike the "self-pacing" approach of dnsperf, resperf works by sending DNS
|
||||
queries at a controlled, steadily increasing rate. By default, resperf will
|
||||
send traffic for 60 seconds, linearly increasing the amount of traffic from
|
||||
zero to 100,000 queries per second.
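
Expressed in terms of the command line options described below, this default
behaviour corresponds to roughly the following invocation (the server address
and query file name are placeholders):
.RS
.hy 0

.nf
resperf \-s 10.0.0.2 \-d queryfile \-r 60 \-m 100000
.fi
.hy
.RE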

During the test, resperf listens for responses from the server and keeps
track of response rates, failure rates, and latencies. It will also continue
listening for responses for an additional 40 seconds after it has stopped
sending traffic, so that there is time for the server to respond to the last
queries sent. This time period was chosen to be longer than the overall
query timeout of both Nominum CacheServe and current versions of BIND.

If the test is successful, the query rate will at some point exceed the
capacity of the server and queries will be dropped, causing the response
rate to stop growing or even decrease as the query rate increases.

The result of the test is a set of measurements of the query rate, response
rate, failure response rate, and average query latency as functions of time.
.SS "What you will need"
Benchmarking a live caching server is serious business. A fast caching
server like Nominum CacheServe, resolving a mix of cacheable and
non-cacheable queries typical of ISP customer traffic, is capable of
resolving well over 1,000,000 queries per second. In the process, it will
send more than 40,000 queries per second to authoritative servers on the
Internet, and receive responses to most of them. Assuming an average request
size of 50 bytes and a response size of 150 bytes, this amounts to some 1216
Mbps of outgoing and 448 Mbps of incoming traffic. If your Internet
connection can't handle the bandwidth, you will end up measuring the speed
of the connection, not the server, and may saturate the connection causing a
degradation in service for other users.
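
Those bandwidth figures follow directly from the numbers above, counting both
the client-facing and the Internet-facing traffic as seen by the server:
.RS
.hy 0

.nf
outgoing: 1,000,000 x 150 B + 40,000 x  50 B = 152 MB/s = 1216 Mbps
incoming: 1,000,000 x  50 B + 40,000 x 150 B =  56 MB/s =  448 Mbps
.fi
.hy
.RE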

Make sure there is no stateful firewall between the server and the Internet,
because most of them can't handle the amount of UDP traffic the test will
generate and will end up dropping packets, skewing the test results. Some
will even lock up or crash.

You should run resperf on a machine separate from the server under test, on
the same LAN. Preferably, this should be a Gigabit Ethernet network. The
machine running resperf should be at least as fast as the machine being
tested; otherwise, it may end up being the bottleneck.

There should be no other applications running on the machine running
resperf. Performance testing at the traffic levels involved is essentially a
hard real-time application - consider the fact that at a query rate of
100,000 queries per second, if resperf gets delayed by just 1/100 of a
second, 1000 incoming UDP packets will arrive in the meantime. This is more
than most operating systems will buffer, which means packets will be
dropped.

Because the granularity of the timers provided by operating systems is
typically too coarse to accurately schedule packet transmissions at
sub-millisecond intervals, resperf will busy-wait between packet
transmissions, constantly polling for responses in the meantime. Therefore,
it is normal for resperf to consume 100% CPU during the whole test run, even
during periods where query rates are relatively low.

You will also need a set of test queries in the \fBdnsperf\fR file format.
See the \fBdnsperf\fR man page for instructions on how to construct this
query file. To make the test as realistic as possible, the queries should be
derived from recorded production client DNS traffic, without removing
duplicate queries or other filtering. With the default settings, resperf
will use up to 3 million queries in each test run.
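
The query file itself is plain text, one query per line, each line giving a
domain name and a query type; see the \fBdnsperf\fR man page for the full
details. A minimal illustrative fragment (the names here are placeholders)
could look like:
.RS
.hy 0

.nf
example.com A
www.example.net AAAA
mail.example.org MX
.fi
.hy
.RE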

If the caching server to be tested has a configurable limit on the number of
simultaneous resolutions, like the \fBmax\-recursive\-clients\fR statement
in Nominum CacheServe or the \fBrecursive\-clients\fR option in BIND 9, you
will probably have to increase it. As a starting point, we recommend a value
of 10000 for Nominum CacheServe and 100000 for BIND 9. Should the limit be
reached, it will show up in the plots as an increase in the number of
failure responses.

The server being tested should be restarted at the beginning of each test to
make sure it is starting with an empty cache. If the cache already contains
data from a previous test run that used the same set of queries, almost all
queries will be answered from the cache, yielding inflated performance
numbers.

To use the \fBresperf\-report\fR script, you need to have \fBgnuplot\fR
installed. Make sure your installed version of \fBgnuplot\fR supports the
png terminal driver. If your \fBgnuplot\fR doesn't support png but does
support gif, you can change the line saying terminal=png in the
\fBresperf\-report\fR script to terminal=gif.
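
For example, assuming \fBresperf\-report\fR is installed somewhere in your
PATH and is writable by you, one way to make that change is a substitution
along these lines:
.RS
.hy 0

.nf
sed \-i 's/terminal=png/terminal=gif/' "$(command \-v resperf\-report)"
.fi
.hy
.RE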
.SS "Running the test"
Resperf is typically invoked via the \fBresperf\-report\fR script, which
will run \fBresperf\fR with its output redirected to a file and then
automatically generate an illustrated report in HTML format. Command line
arguments given to resperf-report will be passed on unchanged to resperf.

When running resperf-report, you will need to specify at least the server IP
address and the query data file. A typical invocation will look like
.RS
.hy 0

.nf
resperf\-report \-s 10.0.0.2 \-d queryfile
.fi
.hy
.RE
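
Any of the options listed in the SYNOPSIS can be appended in the same way;
for example, to bind to a specific local address and raise the target query
rate (both values here are purely illustrative):
.RS
.hy 0

.nf
resperf\-report \-s 10.0.0.2 \-d queryfile \-a 10.0.0.3 \-m 200000
.fi
.hy
.RE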

With default settings, the test run will take at most 100 seconds (60
seconds of ramping up traffic and then 40 seconds of waiting for responses),
but in practice, the 60-second traffic phase will usually be cut short. To
be precise, resperf can transition from the traffic-sending phase to the
waiting-for-responses phase in three different ways:
.IP \(bu 2
Running for the full allotted time and successfully reaching the maximum
query rate (by default, 60 seconds and 100,000 qps, respectively). Since
this is a very high query rate, this will rarely happen (with today's
hardware); one of the other two conditions listed below will usually occur
first.
.IP \(bu 2
Exceeding 65,536 outstanding queries. This often happens as a result of
(successfully) exceeding the capacity of the server being tested, causing
the excess queries to be dropped. The limit of 65,536 queries comes from the
number of possible values for the ID field in the DNS packet. Resperf needs
to allocate a unique ID for each outstanding query, and is therefore unable
to send further queries if the set of possible IDs is exhausted.
.IP \(bu 2
When resperf finds itself unable to send queries fast enough. Resperf will
notice if it is falling behind in its scheduled query transmissions, and if
this backlog reaches 1000 queries, it will print a message like "Fell behind
by 1000 queries" (or whatever the actual number is at the time) and stop
sending traffic.
.PP
Regardless of which of the above conditions caused the traffic-sending phase
of the test to end, you should examine the resulting plots to make sure the
server's response rate is flattening out toward the end of the test. If it
is not, then you are not loading the server enough. If you are getting the
"Fell behind" message, make sure that the machine running resperf is fast
enough and has no other applications running.

You should also monitor the CPU usage of the server under test. It should
reach close to 100% CPU at the point of maximum traffic; if it does not, you
most likely have a bottleneck in some other part of your test setup, for
example, your external Internet connection.

The report generated by \fBresperf\-report\fR will be stored with a unique
file name based on the current date and time, e.g.,
\fI20060812-1550.html\fR. The PNG images of the plots and other auxiliary
files will be stored in separate files beginning with the same date-time
string. To view the report, simply open the \fI.html\fR file in a web
browser.

If you need to copy the report to a separate machine for viewing, make sure
to copy the .png files along with the .html file (or simply copy all the
files, e.g., using scp 20060812-1550.* host:directory/).
.SS "Interpreting the report"
The \fI.html\fR file produced by \fBresperf\-report\fR consists of two
sections. The first section, "Resperf output", contains output from the
\fBresperf\fR program such as progress messages, a summary of the command
line arguments, and summary statistics. The second section, "Plots",
contains two plots generated by \fBgnuplot\fR: "Query/response/failure rate"
and "Latency".

The "Query/response/failure rate" plot contains three graphs. The "Queries
sent per second" graph shows the amount of traffic being sent to the server;
this should be very close to a straight diagonal line, reflecting the linear
ramp-up of traffic.

The "Total responses received per second" graph shows how many of the
queries received a response from the server. All responses are counted,
whether successful (NOERROR or NXDOMAIN) or not (e.g., SERVFAIL).

The "Failure responses received per second" graph shows how many of the
queries received a failure response. A response is considered to be a
failure if its RCODE is neither NOERROR nor NXDOMAIN.

By visually inspecting the graphs, you can get an idea of how the server
behaves under increasing load. The "Total responses received per second"
graph will initially closely follow the "Queries sent per second" graph
(often rendering it invisible in the plot as the two graphs are plotted on
top of one another), but when the load exceeds the server's capacity, the
"Total responses received per second" graph may diverge from the "Queries
sent per second" graph and flatten out, indicating that some of the queries
are being dropped.

The "Failure responses received per second" graph will normally show a
roughly linear ramp close to the bottom of the plot with some random
fluctuation, since typical query traffic will contain some small percentage
of failing queries randomly interspersed with the successful ones. As the
total traffic increases, the number of failures will increase
proportionally.

If the "Failure responses received per second" graph turns sharply upwards,
this can be another indication that the load has exceeded the server's
capacity. This will happen if the server reacts to overload by sending
SERVFAIL responses rather than by dropping queries. Since Nominum CacheServe
and BIND 9 will both respond with SERVFAIL when they exceed their
\fBmax\-recursive\-clients\fR or \fBrecursive\-clients\fR limit,
respectively, a sudden increase in the number of failures could mean that
the limit needs to be increased.

The "Latency" plot contains a single graph marked "Average latency". This
shows how the latency varies during the course of the test. Typically, the
latency graph will exhibit a downwards trend because the cache hit rate
improves as ever more responses are cached during the test, and the latency
for a cache hit is much smaller than for a cache miss. The latency graph is
provided as an aid in determining the point where the server gets
overloaded, which can be seen as a sharp upwards turn in the graph. The
latency graph is not intended for making absolute latency measurements or
comparisons between servers; the latencies shown in the graph are not
representative of production latencies due to the initially empty cache and
the deliberate overloading of the server towards the end of the test.

Note that all measurements are displayed on the plot at the horizontal
position corresponding to the point in time when the query was sent, not
when the response (if any) was received. This makes it easy to compare
the query and response rates; for example, if no queries are dropped, the
query and response graphs will be identical. As another example, if the plot
shows 10% failure responses at t=5 seconds, this means that 10% of the
queries sent at t=5 seconds eventually failed, not that 10% of the responses
received at t=5 seconds were failures.
.SS "Determining the server's maximum throughput"
Often, the goal of running \fBresperf\fR is to determine the server's
maximum throughput, in other words, the number of queries per second it is
capable of handling. This is not always an easy task, because as a server is
driven into overload, the service it provides may deteriorate gradually, and
this deterioration can manifest itself as queries being dropped, as an
increase in the number of SERVFAIL responses, or as an increase in latency.
The maximum throughput may be defined as the highest level of traffic at
which the server still provides an acceptable level of service, but that
means you first need to decide what an acceptable level of service means in
terms of packet drop percentage, SERVFAIL percentage, and latency.

The summary statistics in the "Resperf output" section of the report
contain a "Maximum throughput" value which by default is determined from
the maximum rate at which the server was able to return responses, without
regard to the number of queries being dropped or failing at that point. This
method of throughput measurement has the advantage of simplicity, but it may
or may not be appropriate for your needs; the reported value should always
be validated by a visual inspection of the graphs to ensure that service has
not already deteriorated unacceptably before the maximum response rate is
reached. It may also be helpful to look at the "Lost at that point" value in
the summary statistics; this indicates the percentage of the queries that
was being dropped at the point in the test when the maximum throughput was
reached.

Alternatively, you can make resperf report the throughput at the point in
the test where the percentage of queries dropped exceeds a given limit (or
the maximum as above if the limit is never exceeded). This can be a more
realistic indication of how much the server can be loaded while still
providing an acceptable level of service. This is done using the \fB\-L\fR
command line option; for example, specifying \fB\-L 10\fR makes resperf
report the highest throughput reached before the server starts dropping more
than 10% of the queries.
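
For example, reusing the invocation shown in "Running the test" above:
.RS
.hy 0

.nf
resperf\-report \-s 10.0.0.2 \-d queryfile \-L 10
.fi
.hy
.RE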

There is no corresponding way of automatically constraining results based on
the number of failed queries, because unlike dropped queries, resolution
failures will occur even when the server is not overloaded, and the
number of such failures is heavily dependent on the query data and network
conditions. Therefore, the plots should be manually inspected to ensure that
there is not an abnormal number of failures.
.SH "GENERATING CONSTANT TRAFFIC"
In addition to ramping up traffic linearly, \fBresperf\fR also has the
capability to send a constant stream of traffic. This can be useful when
using \fBresperf\fR for tasks other than performance measurement; for
example, it can be used to "soak test" a server by subjecting it to a
sustained load for an extended period of time.

To generate a constant traffic load, use the \fB\-c\fR command line option,
together with the \fB\-m\fR option which specifies the desired constant
query rate. For example, to send 10000 queries per second for an hour, use
\fB\-m 10000 \-c 3600\fR. This will include the usual 60-second gradual
ramp-up of traffic at the beginning, which may be useful to avoid initially
overwhelming a server that is starting with an empty cache. To start the
onslaught of traffic instantly, use \fB\-m 10000 \-c 3600 \-r 0\fR.

To be precise, \fBresperf\fR will do a linear ramp-up of traffic from 0 to
\fB\-m\fR queries per second over a period of \fB\-r\fR seconds, followed by
a plateau of steady traffic at \fB\-m\fR queries per second lasting for
\fB\-c\fR seconds, followed by waiting for responses for an extra 40
seconds. Either the ramp-up or the plateau can be suppressed by supplying a
duration of zero seconds with \fB\-r 0\fR and \fB\-c 0\fR, respectively. The
latter is the default.

Sending traffic at high rates for hours on end will of course require very
large amounts of input data. Also, a long-running test will generate a large
amount of plot data, which is kept in memory for the duration of the test.
To reduce the memory usage and the size of the plot file, consider
increasing the interval between measurements from the default of 0.5 seconds
using the \fB\-i\fR option in long-running tests.
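
Putting these options together, an hour-long soak test at 10000 queries per
second, sampling every 5 seconds and writing the plot data to a file of your
choosing (the interval and file name here are only illustrative), might be
started as:
.RS
.hy 0

.nf
resperf \-s 10.0.0.2 \-d queryfile \-m 10000 \-c 3600 \-i 5 \-P soak.gnuplot
.fi
.hy
.RE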

When using \fBresperf\fR for long-running tests, it is important that the
traffic rate specified using the \fB\-m\fR option is one that both
\fBresperf\fR itself and the server under test can sustain. Otherwise, the
test is likely to be cut short as a result of either running out of query
IDs (because of large numbers of dropped queries) or of resperf falling
behind its transmission schedule.
.SH OPTIONS
Because the \fBresperf\-report\fR script passes its command line options
directly to the \fBresperf\fR program, they both accept the same set of
options, with one exception: \fBresperf\-report\fR automatically adds an
appropriate \fB\-P\fR to the \fBresperf\fR command line, and therefore does
not itself take a \fB\-P\fR option.

\fB-d \fIdatafile\fB\fR
.br
.RS
Specifies the input data file. If not specified, \fBresperf\fR will read
from standard input.
.RE

\fB-M \fImode\fB\fR
.br
.RS
Specifies the transport mode to use, "udp", "tcp" or "tls". Default is "udp".
.RE

\fB-s \fIserver_addr\fB\fR
.br
.RS
Specifies the name or address of the server to which requests will be sent.
The default is the loopback address, 127.0.0.1.
.RE

\fB-p \fIport\fB\fR
.br
.RS
Sets the port on which the DNS packets are sent. If not specified, the
standard DNS port (udp/tcp 53, tls 853) is used.
.RE

\fB-a \fIlocal_addr\fB\fR
.br
.RS
Specifies the local address from which to send requests. The default is the
wildcard address.
.RE

\fB-x \fIlocal_port\fB\fR
.br
.RS
Specifies the local port from which to send requests. The default is the
wildcard port (0).

If acting as multiple clients and the wildcard port is used, each client
will use a different random port. If a port is specified, the clients will
use a range of ports starting with the specified one.
.RE

\fB-t \fItimeout\fB\fR
.br
.RS
Specifies the request timeout value, in seconds. \fBresperf\fR will no
longer wait for a response to a particular request after this many seconds
have elapsed. The default is 45 seconds.

\fBresperf\fR times out unanswered requests in order to reclaim query IDs so
that the query ID space will not be exhausted in a long-running test, such
as when "soak testing" a server for a day with \fB\-m 10000 \-c 86400\fR.
The timeouts and the ability to tune them are of little use in the more
typical use case of a performance test lasting only a minute or two.

The default timeout of 45 seconds was chosen to be longer than the query
timeout of current caching servers. Note that this is longer than the
corresponding default in \fBdnsperf\fR, because caching servers can take
many orders of magnitude longer to answer a query than authoritative servers
do.

If a short timeout is used, there is a possibility that \fBresperf\fR will
receive a response after the corresponding request has timed out; in this
case, a message like "Warning: Received a response with an unexpected id:
141" will be printed.
.RE

\fB-b \fIbufsize\fB\fR
.br
.RS
Sets the size of the socket's send and receive buffers, in kilobytes. If not
specified, the operating system's default is used.
.RE

\fB-f \fIfamily\fB\fR
.br
.RS
Specifies the address family used for sending DNS packets. The possible
values are "inet", "inet6", or "any". If "any" (the default value) is
specified, \fBresperf\fR will use whichever address family is appropriate
for the server it is sending packets to.
.RE

\fB-e\fR
.br
.RS
Enables EDNS0 [RFC2671] by adding an OPT record to all packets sent.
.RE

\fB-D\fR
.br
.RS
Sets the DO (DNSSEC OK) bit [RFC3225] in all packets sent. This also enables
EDNS0, which is required for DNSSEC.
.RE

\fB-y \fI[alg:]name:secret\fB\fR
.br
.RS
Adds a TSIG record [RFC2845] to all packets sent, using the specified TSIG
key algorithm, name and secret, where the algorithm defaults to hmac-md5 and
the secret is expressed as a base-64 encoded string.
.RE
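
For example, omitting the algorithm (so the hmac-md5 default applies) and
using a placeholder key name and base-64 secret:
.RS
.hy 0

.nf
resperf \-s 10.0.0.2 \-d queryfile \-y test-key:c2VjcmV0
.fi
.hy
.RE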

\fB-h\fR
.br
.RS
Print a usage statement and exit.
.RE

\fB-i \fIinterval\fB\fR
.br
.RS
Specifies the time interval between data points in the plot file. The
default is 0.5 seconds.
.RE

\fB-m \fImax_qps\fB\fR
.br
.RS
Specifies the target maximum query rate (in queries per second). This should
be higher than the expected maximum throughput of the server being tested.
Traffic will be ramped up at a linearly increasing rate until this value is
reached, or until one of the other conditions described in the section
"Running the test" occurs. The default is 100000 queries per second.
.RE

\fB-P \fIplot_data_file\fB\fR
.br
.RS
Specifies the name of the plot data file. The default is
\fIresperf.gnuplot\fR.
.RE

\fB-r \fIrampup_time\fB\fR
.br
.RS
Specifies the length of time over which traffic will be ramped up. The
default is 60 seconds.
.RE

\fB-c \fIconstant_traffic_time\fB\fR
.br
.RS
Specifies the length of time for which traffic will be sent at a constant
rate following the initial ramp-up. The default is 0 seconds, meaning no
sending of traffic at a constant rate will be done.
.RE

\fB-L \fImax_loss\fB\fR
.br
.RS
Specifies the maximum acceptable query loss percentage for purposes of
determining the maximum throughput value. The default is 100%, meaning that
\fBresperf\fR will measure the maximum throughput without regard to query
loss.
.RE

\fB-C \fIclients\fB\fR
.br
.RS
Act as multiple clients. Requests are sent from multiple sockets. The
default is to act as 1 client.
.RE

\fB-q \fImax_outstanding\fB\fR
.br
.RS
Sets the maximum number of outstanding requests. \fBresperf\fR will stop
ramping up traffic when this many queries are outstanding. The default is
64k, and the limit is 64k per client.
.RE

\fB-v\fR
.br
.RS
Enables verbose mode to report on network readiness and congestion.
.RE
.SH "THE PLOT DATA FILE"
The plot data file is written by the \fBresperf\fR program and contains the
data to be plotted using \fBgnuplot\fR. When running \fBresperf\fR via the
\fBresperf\-report\fR script, there is no need for the user to deal with
this file directly, but its format and contents are documented here for
completeness and in case you wish to run \fBresperf\fR directly and use its
output for purposes other than viewing it with \fBgnuplot\fR.

The first line of the file is a comment identifying the fields. It may be
recognized as a comment by its leading hash sign (#).

Subsequent lines contain the actual plot data. For purposes of generating
the plot data file, the test run is divided into time intervals of 0.5
seconds (or some other length of time specified with the \fB\-i\fR command
line option). Each line corresponds to one such interval, and contains the
following values as floating-point numbers:

\fBTime\fR
.br
.RS
The midpoint of this time interval, in seconds since the beginning of the
run
.RE

\fBTarget queries per second\fR
.br
.RS
The number of queries per second scheduled to be sent in this time interval
.RE

\fBActual queries per second\fR
.br
.RS
The number of queries per second actually sent in this time interval
.RE

\fBResponses per second\fR
.br
.RS
The number of responses received corresponding to queries sent in this time
interval, divided by the length of the interval
.RE

\fBFailures per second\fR
.br
.RS
The number of responses received corresponding to queries sent in this time
interval and having an RCODE other than NOERROR or NXDOMAIN, divided by the
length of the interval
.RE

\fBAverage latency\fR
.br
.RS
The average time between sending the query and receiving a response, for
queries sent in this time interval
.RE
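
As an illustration (nothing below is part of \fBresperf\fR itself), the time,
actual query rate and response rate, i.e. columns 1, 3 and 4 as described
above, could be plotted straight from a shell with \fBgnuplot\fR along these
lines, assuming the default \fB\-P\fR file name:
.RS
.hy 0

.nf
gnuplot <<'EOF'
set terminal png
set output "rates.png"
plot "resperf.gnuplot" using 1:3 with lines, "" using 1:4 with lines
EOF
.fi
.hy
.RE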
.SH "SEE ALSO"
\fBdnsperf\fR(1)
.SH AUTHOR
Nominum, Inc.
.LP
Maintained by DNS-OARC
.LP
.RS
.I https://www.dns-oarc.net/
.RE
.LP
.SH BUGS
For issues and feature requests please use:
.LP
.RS
\fI@PACKAGE_URL@\fP
.RE
.LP
For questions and help please use:
.LP
.RS
\fI@PACKAGE_BUGREPORT@\fP
.RE
.LP