.\" Copyright 2019-2021 OARC, Inc.
|
||
|
.\" Copyright 2017-2018 Akamai Technologies
|
||
|
.\" Copyright 2006-2016 Nominum, Inc.
|
||
|
.\" All rights reserved.
|
||
|
.\"
|
||
|
.\" Licensed under the Apache License, Version 2.0 (the "License");
|
||
|
.\" you may not use this file except in compliance with the License.
|
||
|
.\" You may obtain a copy of the License at
|
||
|
.\"
|
||
|
.\" http://www.apache.org/licenses/LICENSE-2.0
|
||
|
.\"
|
||
|
.\" Unless required by applicable law or agreed to in writing, software
|
||
|
.\" distributed under the License is distributed on an "AS IS" BASIS,
|
||
|
.\" WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||
|
.\" See the License for the specific language governing permissions and
|
||
|
.\" limitations under the License.
|
||
|
.TH resperf 1 "@PACKAGE_VERSION@" "resperf"
|
||
|
.SH NAME
|
||
|
resperf \- test the resolution performance of a caching DNS server
|
||
|
.SH SYNOPSIS
|
||
|
.hy 0
|
||
|
.ad l
|
||
|
\fBresperf\-report\fR\ [\fB\-a\ \fIlocal_addr\fB\fR]
|
||
|
[\fB\-d\ \fIdatafile\fB\fR]
|
||
|
[\fB\-M\ \fImode\fB\fR]
|
||
|
[\fB\-s\ \fIserver_addr\fB\fR]
|
||
|
[\fB\-p\ \fIport\fB\fR]
|
||
|
[\fB\-x\ \fIlocal_port\fB\fR]
|
||
|
[\fB\-t\ \fItimeout\fB\fR]
|
||
|
[\fB\-b\ \fIbufsize\fB\fR]
|
||
|
[\fB\-f\ \fIfamily\fB\fR]
|
||
|
[\fB\-e\fR]
|
||
|
[\fB\-D\fR]
|
||
|
[\fB\-y\ \fI[alg:]name:secret\fB\fR]
|
||
|
[\fB\-h\fR]
|
||
|
[\fB\-i\ \fIinterval\fB\fR]
|
||
|
[\fB\-m\ \fImax_qps\fB\fR]
|
||
|
[\fB\-r\ \fIrampup_time\fB\fR]
|
||
|
[\fB\-c\ \fIconstant_traffic_time\fB\fR]
|
||
|
[\fB\-L\ \fImax_loss\fB\fR]
|
||
|
[\fB\-C\ \fIclients\fB\fR]
|
||
|
[\fB\-q\ \fImax_outstanding\fB\fR]
|
||
|
[\fB\-v\fR]
|
||
|
.ad
|
||
|
.hy
|
||
|
.hy 0
|
||
|
.ad l
|
||
|
|
||
|
\fBresperf\fR\ [\fB\-a\ \fIlocal_addr\fB\fR]
|
||
|
[\fB\-d\ \fIdatafile\fB\fR]
|
||
|
[\fB\-M\ \fImode\fB\fR]
|
||
|
[\fB\-s\ \fIserver_addr\fB\fR]
|
||
|
[\fB\-p\ \fIport\fB\fR]
|
||
|
[\fB\-x\ \fIlocal_port\fB\fR]
|
||
|
[\fB\-t\ \fItimeout\fB\fR]
|
||
|
[\fB\-b\ \fIbufsize\fB\fR]
|
||
|
[\fB\-f\ \fIfamily\fB\fR]
|
||
|
[\fB\-e\fR]
|
||
|
[\fB\-D\fR]
|
||
|
[\fB\-y\ \fI[alg:]name:secret\fB\fR]
|
||
|
[\fB\-h\fR]
|
||
|
[\fB\-i\ \fIinterval\fB\fR]
|
||
|
[\fB\-m\ \fImax_qps\fB\fR]
|
||
|
[\fB\-P\ \fIplot_data_file\fB\fR]
|
||
|
[\fB\-r\ \fIrampup_time\fB\fR]
|
||
|
[\fB\-c\ \fIconstant_traffic_time\fB\fR]
|
||
|
[\fB\-L\ \fImax_loss\fB\fR]
|
||
|
[\fB\-C\ \fIclients\fB\fR]
|
||
|
[\fB\-q\ \fImax_outstanding\fB\fR]
|
||
|
[\fB\-v\fR]
|
||
|
.ad
|
||
|
.hy
|
||
|
.SH DESCRIPTION
|
||
|
\fBresperf\fR is a companion tool to \fBdnsperf\fR. \fBdnsperf\fR was
|
||
|
primarily designed for benchmarking authoritative servers, and it does not
|
||
|
work well with caching servers that are talking to the live Internet. One
|
||
|
reason for this is that dnsperf uses a "self-pacing" approach, which is
|
||
|
based on the assumption that you can keep the server 100% busy simply by
|
||
|
sending it a small burst of back-to-back queries to fill up network buffers,
|
||
|
and then send a new query whenever you get a response back. This approach
|
||
|
works well for authoritative servers that process queries in order and one
|
||
|
at a time; it also works pretty well for a caching server in a closed
|
||
|
laboratory environment talking to a simulated Internet that's all on the
|
||
|
same LAN. Unfortunately, it does not work well with a caching server talking
|
||
|
to the actual Internet, which may need to work on thousands of queries in
|
||
|
parallel to achieve its maximum throughput. There have been numerous
|
||
|
attempts to use dnsperf (or its predecessor, queryperf) for benchmarking
|
||
|
live caching servers, usually with poor results. Therefore, a separate tool
|
||
|
designed specifically for caching servers is needed.
.SS "How resperf works"
Unlike the "self-pacing" approach of dnsperf, resperf works by sending DNS
queries at a controlled, steadily increasing rate. By default, resperf will
send traffic for 60 seconds, linearly increasing the amount of traffic from
zero to 100,000 queries per second.

During the test, resperf listens for responses from the server and keeps
track of response rates, failure rates, and latencies. It will also continue
listening for responses for an additional 40 seconds after it has stopped
sending traffic, so that there is time for the server to respond to the last
queries sent. This time period was chosen to be longer than the overall
query timeout of both Nominum CacheServe and current versions of BIND.

If the test is successful, the query rate will at some point exceed the
capacity of the server and queries will be dropped, causing the response
rate to stop growing or even decrease as the query rate increases.

The result of the test is a set of measurements of the query rate, response
rate, failure response rate, and average query latency as functions of time.
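
For example, to ramp up to only 20,000 queries per second over a longer,
120\-second period (both values are illustrative and should be chosen to
suit the server under test), \fBresperf\fR could be invoked as:
.RS
.hy 0

.nf
resperf \-s 10.0.0.2 \-d queryfile \-m 20000 \-r 120
.fi
.hy
.RE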
.SS "What you will need"
Benchmarking a live caching server is serious business. A fast caching
server like Nominum CacheServe, resolving a mix of cacheable and
non-cacheable queries typical of ISP customer traffic, is capable of
resolving well over 1,000,000 queries per second. In the process, it will
send more than 40,000 queries per second to authoritative servers on the
Internet, and receive responses to most of them. Assuming an average request
size of 50 bytes and a response size of 150 bytes, this amounts to some 16
Mbps of outgoing and 48 Mbps of incoming traffic. If your Internet
connection can't handle the bandwidth, you will end up measuring the speed
of the connection, not the server, and may saturate the connection, causing a
degradation in service for other users.
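
To see where those figures come from, the arithmetic under the stated
assumptions is simply:
.RS
.hy 0

.nf
outgoing: 40,000 qps *  50 bytes * 8 bits/byte = 16,000,000 bit/s = 16 Mbps
incoming: 40,000 qps * 150 bytes * 8 bits/byte = 48,000,000 bit/s = 48 Mbps
.fi
.hy
.RE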

Make sure there is no stateful firewall between the server and the Internet,
because most of them can't handle the amount of UDP traffic the test will
generate and will end up dropping packets, skewing the test results. Some
will even lock up or crash.

You should run resperf on a machine separate from the server under test, on
the same LAN. Preferably, this should be a Gigabit Ethernet network. The
machine running resperf should be at least as fast as the machine being
tested; otherwise, it may end up being the bottleneck.

There should be no other applications running on the machine running
resperf. Performance testing at the traffic levels involved is essentially a
hard real-time application - consider the fact that at a query rate of
100,000 queries per second, if resperf gets delayed by just 1/100 of a
second, 1000 incoming UDP packets will arrive in the meantime. This is more
than most operating systems will buffer, which means packets will be
dropped.

Because the granularity of the timers provided by operating systems is
typically too coarse to accurately schedule packet transmissions at
sub-millisecond intervals, resperf will busy-wait between packet
transmissions, constantly polling for responses in the meantime. Therefore,
it is normal for resperf to consume 100% CPU during the whole test run, even
during periods where query rates are relatively low.

You will also need a set of test queries in the \fBdnsperf\fR file format.
See the \fBdnsperf\fR man page for instructions on how to construct this
query file. To make the test as realistic as possible, the queries should be
derived from recorded production client DNS traffic, without removing
duplicate queries or other filtering. With the default settings, resperf
will use up to 3 million queries in each test run.
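
A minimal example of the query file format (one query name and type per
line; the names here are placeholders, see \fBdnsperf\fR(1) for the full
format):
.RS
.hy 0

.nf
www.example.com A
mail.example.com AAAA
example.org MX
.fi
.hy
.RE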

If the caching server to be tested has a configurable limit on the number of
simultaneous resolutions, like the \fBmax\-recursive\-clients\fR statement
in Nominum CacheServe or the \fBrecursive\-clients\fR option in BIND 9, you
will probably have to increase it. As a starting point, we recommend a value
of 10000 for Nominum CacheServe and 100000 for BIND 9. Should the limit be
reached, it will show up in the plots as an increase in the number of
failure responses.
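
For BIND 9, for example, the limit is raised in \fInamed.conf\fR; a minimal
sketch using the starting value suggested above:
.RS
.hy 0

.nf
options {
    // illustrative starting point; tune for the server under test
    recursive\-clients 100000;
};
.fi
.hy
.RE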

The server being tested should be restarted at the beginning of each test to
make sure it is starting with an empty cache. If the cache already contains
data from a previous test run that used the same set of queries, almost all
queries will be answered from the cache, yielding inflated performance
numbers.

To use the \fBresperf\-report\fR script, you need to have \fBgnuplot\fR
installed. Make sure your installed version of \fBgnuplot\fR supports the
png terminal driver. If your \fBgnuplot\fR doesn't support png but does
support gif, you can change the line saying terminal=png in the
\fBresperf\-report\fR script to terminal=gif.
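
With GNU \fBsed\fR, for example, that change could be made in place
(assuming the script is writable and in the current directory):
.RS
.hy 0

.nf
sed \-i 's/terminal=png/terminal=gif/' resperf\-report
.fi
.hy
.RE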
.SS "Running the test"
Resperf is typically invoked via the \fBresperf\-report\fR script, which
will run \fBresperf\fR with its output redirected to a file and then
automatically generate an illustrated report in HTML format. Command line
arguments given to resperf-report will be passed on unchanged to resperf.

When running resperf-report, you will need to specify at least the server IP
address and the query data file. A typical invocation will look like
.RS
.hy 0

.nf
resperf\-report \-s 10.0.0.2 \-d queryfile
.fi
.hy
.RE

With default settings, the test run will take at most 100 seconds (60
seconds of ramping up traffic and then 40 seconds of waiting for responses),
but in practice, the 60-second traffic phase will usually be cut short. To
be precise, resperf can transition from the traffic-sending phase to the
waiting-for-responses phase in three different ways:
.IP \(bu 2
Running for the full allotted time and successfully reaching the maximum
query rate (by default, 60 seconds and 100,000 qps, respectively). Since
this is a very high query rate, this will rarely happen (with today's
hardware); one of the other two conditions listed below will usually occur
first.
.IP \(bu 2
Exceeding 65,536 outstanding queries. This often happens as a result of
(successfully) exceeding the capacity of the server being tested, causing
the excess queries to be dropped. The limit of 65,536 queries comes from the
number of possible values for the ID field in the DNS packet. Resperf needs
to allocate a unique ID for each outstanding query, and is therefore unable
to send further queries if the set of possible IDs is exhausted.
.IP \(bu 2
When resperf finds itself unable to send queries fast enough. Resperf will
notice if it is falling behind in its scheduled query transmissions, and if
this backlog reaches 1000 queries, it will print a message like "Fell behind
by 1000 queries" (or whatever the actual number is at the time) and stop
sending traffic.
.PP
Regardless of which of the above conditions caused the traffic-sending phase
of the test to end, you should examine the resulting plots to make sure the
server's response rate is flattening out toward the end of the test. If it
is not, then you are not loading the server enough. If you are getting the
"Fell behind" message, make sure that the machine running resperf is fast
enough and has no other applications running.

You should also monitor the CPU usage of the server under test. It should
reach close to 100% CPU at the point of maximum traffic; if it does not, you
most likely have a bottleneck in some other part of your test setup, for
example, your external Internet connection.

The report generated by \fBresperf\-report\fR will be stored with a unique
file name based on the current date and time, e.g.,
\fI20060812-1550.html\fR. The PNG images of the plots and other auxiliary
files will be stored in separate files beginning with the same date-time
string. To view the report, simply open the \fI.html\fR file in a web
browser.

If you need to copy the report to a separate machine for viewing, make sure
to copy the .png files along with the .html file (or simply copy all the
files, e.g., using scp 20060812-1550.* host:directory/).
.SS "Interpreting the report"
|
||
|
The \fI.html\fR file produced by \fBresperf\-report\fR consists of two
|
||
|
sections. The first section, "Resperf output", contains output from the
|
||
|
\fBresperf\fR program such as progress messages, a summary of the command
|
||
|
line arguments, and summary statistics. The second section, "Plots",
|
||
|
contains two plots generated by \fBgnuplot\fR: "Query/response/failure rate"
|
||
|
and "Latency".
|
||
|
|
||
|
The "Query/response/failure rate" plot contains three graphs. The "Queries
|
||
|
sent per second" graph shows the amount of traffic being sent to the server;
|
||
|
this should be very close to a straight diagonal line, reflecting the linear
|
||
|
ramp-up of traffic.
|
||
|
|
||
|
The "Total responses received per second" graph shows how many of the
|
||
|
queries received a response from the server. All responses are counted,
|
||
|
whether successful (NOERROR or NXDOMAIN) or not (e.g., SERVFAIL).
|
||
|
|
||
|
The "Failure responses received per second" graph shows how many of the
|
||
|
queries received a failure response. A response is considered to be a
|
||
|
failure if its RCODE is neither NOERROR nor NXDOMAIN.
|
||
|
|
||
|
By visually inspecting the graphs, you can get an idea of how the server
|
||
|
behaves under increasing load. The "Total responses received per second"
|
||
|
graph will initially closely follow the "Queries sent per second" graph
|
||
|
(often rendering it invisible in the plot as the two graphs are plotted on
|
||
|
top of one another), but when the load exceeds the server's capacity, the
|
||
|
"Total responses received per second" graph may diverge from the "Queries
|
||
|
sent per second" graph and flatten out, indicating that some of the queries
|
||
|
are being dropped.
|
||
|
|
||
|
The "Failure responses received per second" graph will normally show a
|
||
|
roughly linear ramp close to the bottom of the plot with some random
|
||
|
fluctuation, since typical query traffic will contain some small percentage
|
||
|
of failing queries randomly interspersed with the successful ones. As the
|
||
|
total traffic increases, the number of failures will increase
|
||
|
proportionally.
|
||
|
|
||
|
If the "Failure responses received per second" graph turns sharply upwards,
|
||
|
this can be another indication that the load has exceeded the server's
|
||
|
capacity. This will happen if the server reacts to overload by sending
|
||
|
SERVFAIL responses rather than by dropping queries. Since Nominum CacheServe
|
||
|
and BIND 9 will both respond with SERVFAIL when they exceed their
|
||
|
\fBmax\-recursive\-clients\fR or \fBrecursive\-clients\fR limit,
|
||
|
respectively, a sudden increase in the number of failures could mean that
|
||
|
the limit needs to be increased.
|
||
|
|
||
|
The "Latency" plot contains a single graph marked "Average latency". This
|
||
|
shows how the latency varies during the course of the test. Typically, the
|
||
|
latency graph will exhibit a downwards trend because the cache hit rate
|
||
|
improves as ever more responses are cached during the test, and the latency
|
||
|
for a cache hit is much smaller than for a cache miss. The latency graph is
|
||
|
provided as an aid in determining the point where the server gets
|
||
|
overloaded, which can be seen as a sharp upwards turn in the graph. The
|
||
|
latency graph is not intended for making absolute latency measurements or
|
||
|
comparisons between servers; the latencies shown in the graph are not
|
||
|
representative of production latencies due to the initially empty cache and
|
||
|
the deliberate overloading of the server towards the end of the test.

Note that all measurements are displayed on the plot at the horizontal
position corresponding to the point in time when the query was sent, not
when the response (if any) was received. This makes it easy to compare
the query and response rates; for example, if no queries are dropped, the
query and response graphs will be identical. As another example, if the plot
shows 10% failure responses at t=5 seconds, this means that 10% of the
queries sent at t=5 seconds eventually failed, not that 10% of the responses
received at t=5 seconds were failures.
.SS "Determining the server's maximum throughput"
|
||
|
Often, the goal of running \fBresperf\fR is to determine the server's
|
||
|
maximum throughput, in other words, the number of queries per second it is
|
||
|
capable of handling. This is not always an easy task, because as a server is
|
||
|
driven into overload, the service it provides may deteriorate gradually, and
|
||
|
this deterioration can manifest itself either as queries being dropped, as
|
||
|
an increase in the number of SERVFAIL responses, or an increase in latency.
|
||
|
The maximum throughput may be defined as the highest level of traffic at
|
||
|
which the server still provides an acceptable level of service, but that
|
||
|
means you first need to decide what an acceptable level of service means in
|
||
|
terms of packet drop percentage, SERVFAIL percentage, and latency.
|
||
|
|
||
|
The summary statistics in the "Resperf output" section of the report
contain a "Maximum throughput" value which by default is determined from
the maximum rate at which the server was able to return responses, without
regard to the number of queries being dropped or failing at that point. This
method of throughput measurement has the advantage of simplicity, but it may
or may not be appropriate for your needs; the reported value should always
be validated by a visual inspection of the graphs to ensure that service has
not already deteriorated unacceptably before the maximum response rate is
reached. It may also be helpful to look at the "Lost at that point" value in
the summary statistics; this indicates the percentage of the queries that
was being dropped at the point in the test when the maximum throughput was
reached.

Alternatively, you can make resperf report the throughput at the point in
the test where the percentage of queries dropped exceeds a given limit (or
the maximum as above if the limit is never exceeded). This can be a more
realistic indication of how much the server can be loaded while still
providing an acceptable level of service. This is done using the \fB\-L\fR
command line option; for example, specifying \fB\-L 10\fR makes resperf
report the highest throughput reached before the server starts dropping more
than 10% of the queries.
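
For example, to generate a report whose "Maximum throughput" figure ignores
any traffic level at which more than 10% of the queries were being dropped:
.RS
.hy 0

.nf
resperf\-report \-s 10.0.0.2 \-d queryfile \-L 10
.fi
.hy
.RE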

There is no corresponding way of automatically constraining results based on
the number of failed queries, because unlike dropped queries, resolution
failures will occur even when the server is not overloaded, and the
number of such failures is heavily dependent on the query data and network
conditions. Therefore, the plots should be manually inspected to ensure that
there is not an abnormal number of failures.
.SH "GENERATING CONSTANT TRAFFIC"
|
||
|
In addition to ramping up traffic linearly, \fBresperf\fR also has the
|
||
|
capability to send a constant stream of traffic. This can be useful when
|
||
|
using \fBresperf\fR for tasks other than performance measurement; for
|
||
|
example, it can be used to "soak test" a server by subjecting it to a
|
||
|
sustained load for an extended period of time.
|
||
|
|
||
|
To generate a constant traffic load, use the \fB\-c\fR command line option,
|
||
|
together with the \fB\-m\fR option which specifies the desired constant
|
||
|
query rate. For example, to send 10000 queries per second for an hour, use
|
||
|
\fB\-m 10000 \-c 3600\fR. This will include the usual 30-second gradual
|
||
|
ramp-up of traffic at the beginning, which may be useful to avoid initially
|
||
|
overwhelming a server that is starting with an empty cache. To start the
|
||
|
onslaught of traffic instantly, use \fB\-m 10000 \-c 3600 \-r 0\fR.
|
||
|
|
||
|
To be precise, \fBresperf\fR will do a linear ramp-up of traffic from 0 to
\fB\-m\fR queries per second over a period of \fB\-r\fR seconds, followed by
a plateau of steady traffic at \fB\-m\fR queries per second lasting for
\fB\-c\fR seconds, followed by waiting for responses for an extra 40
seconds. Either the ramp-up or the plateau can be suppressed by supplying a
duration of zero seconds with \fB\-r 0\fR and \fB\-c 0\fR, respectively. The
latter is the default.

Sending traffic at high rates for hours on end will of course require very
large amounts of input data. Also, a long-running test will generate a large
amount of plot data, which is kept in memory for the duration of the test.
To reduce the memory usage and the size of the plot file, consider
increasing the interval between measurements from the default of 0.5 seconds
using the \fB\-i\fR option in long-running tests.
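
As an illustration, a day\-long soak test at 10000 queries per second that
records a data point every 5 seconds (all values, addresses and file names
here are examples only) might be run as:
.RS
.hy 0

.nf
resperf \-s 10.0.0.2 \-d queryfile \-m 10000 \-c 86400 \-i 5 \-P soak.gnuplot
.fi
.hy
.RE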

When using \fBresperf\fR for long-running tests, it is important that the
traffic rate specified using the \fB\-m\fR option is one that both \fBresperf\fR
itself and the server under test can sustain. Otherwise, the test is likely
to be cut short as a result of either running out of query IDs (because of
large numbers of dropped queries) or of resperf falling behind its
transmission schedule.
.SH OPTIONS
Because the \fBresperf\-report\fR script passes its command line options
directly to the \fBresperf\fR program, they both accept the same set of
options, with one exception: \fBresperf\-report\fR automatically adds an
appropriate \fB\-P\fR to the \fBresperf\fR command line, and therefore does
not itself take a \fB\-P\fR option.

\fB-d \fIdatafile\fB\fR
.br
.RS
Specifies the input data file. If not specified, \fBresperf\fR will read
from standard input.
.RE

\fB-M \fImode\fB\fR
.br
.RS
Specifies the transport mode to use, "udp", "tcp" or "tls". Default is "udp".
.RE

\fB-s \fIserver_addr\fB\fR
.br
.RS
Specifies the name or address of the server to which requests will be sent.
The default is the loopback address, 127.0.0.1.
.RE

\fB-p \fIport\fB\fR
.br
.RS
Sets the port on which the DNS packets are sent. If not specified, the
standard DNS port (udp/tcp 53, tls 853) is used.
.RE

\fB-a \fIlocal_addr\fB\fR
.br
.RS
Specifies the local address from which to send requests. The default is the
wildcard address.
.RE

\fB-x \fIlocal_port\fB\fR
.br
.RS
Specifies the local port from which to send requests. The default is the
wildcard port (0).

If acting as multiple clients and the wildcard port is used, each client
will use a different random port. If a port is specified, the clients will
use a range of ports starting with the specified one.
.RE

\fB-t \fItimeout\fB\fR
.br
.RS
Specifies the request timeout value, in seconds. \fBresperf\fR will no
longer wait for a response to a particular request after this many seconds
have elapsed. The default is 45 seconds.

\fBresperf\fR times out unanswered requests in order to reclaim query IDs so
that the query ID space will not be exhausted in a long-running test, such
as when "soak testing" a server for a day with \fB\-m 10000 \-c 86400\fR.
The timeouts and the ability to tune them are of little use in the more
typical use case of a performance test lasting only a minute or two.

The default timeout of 45 seconds was chosen to be longer than the query
timeout of current caching servers. Note that this is longer than the
corresponding default in \fBdnsperf\fR, because caching servers can take
many orders of magnitude longer to answer a query than authoritative servers
do.

If a short timeout is used, there is a possibility that \fBresperf\fR will
receive a response after the corresponding request has timed out; in this
case, a message like Warning: Received a response with an unexpected id: 141
will be printed.
.RE

\fB-b \fIbufsize\fB\fR
.br
.RS
Sets the size of the socket's send and receive buffers, in kilobytes. If not
specified, the operating system's default is used.
.RE

\fB-f \fIfamily\fB\fR
.br
.RS
Specifies the address family used for sending DNS packets. The possible
values are "inet", "inet6", or "any". If "any" (the default value) is
specified, \fBresperf\fR will use whichever address family is appropriate
for the server it is sending packets to.
.RE

\fB-e\fR
.br
.RS
Enables EDNS0 [RFC2671], by adding an OPT record to all packets sent.
.RE

\fB-D\fR
.br
.RS
Sets the DO (DNSSEC OK) bit [RFC3225] in all packets sent. This also enables
EDNS0, which is required for DNSSEC.
.RE

\fB-y \fI[alg:]name:secret\fB\fR
.br
.RS
Add a TSIG record [RFC2845] to all packets sent, using the specified TSIG
key algorithm, name and secret, where the algorithm defaults to hmac-md5 and
the secret is expressed as a base-64 encoded string.
.RE

\fB-h\fR
.br
.RS
Print a usage statement and exit.
.RE

\fB-i \fIinterval\fB\fR
.br
.RS
Specifies the time interval between data points in the plot file. The
default is 0.5 seconds.
.RE

\fB-m \fImax_qps\fB\fR
.br
.RS
Specifies the target maximum query rate (in queries per second). This should
be higher than the expected maximum throughput of the server being tested.
Traffic will be ramped up at a linearly increasing rate until this value is
reached, or until one of the other conditions described in the section
"Running the test" occurs. The default is 100000 queries per second.
.RE

\fB-P \fIplot_data_file\fB\fR
.br
.RS
Specifies the name of the plot data file. The default is
\fIresperf.gnuplot\fR.
.RE

\fB-r \fIrampup_time\fB\fR
.br
.RS
Specifies the length of time over which traffic will be ramped up. The
default is 60 seconds.
.RE

\fB-c \fIconstant_traffic_time\fB\fR
.br
.RS
Specifies the length of time for which traffic will be sent at a constant
rate following the initial ramp-up. The default is 0 seconds, meaning no
sending of traffic at a constant rate will be done.
.RE

\fB-L \fImax_loss\fB\fR
.br
.RS
Specifies the maximum acceptable query loss percentage for purposes of
determining the maximum throughput value. The default is 100%, meaning that
\fBresperf\fR will measure the maximum throughput without regard to query
loss.
.RE

\fB-C \fIclients\fB\fR
.br
.RS
Act as multiple clients. Requests are sent from multiple sockets. The
default is to act as 1 client.
.RE

\fB-q \fImax_outstanding\fB\fR
.br
.RS
Sets the maximum number of outstanding requests. \fBresperf\fR will stop
ramping up traffic when this many queries are outstanding. The default is
64k, and the limit is 64k per client.
.RE

\fB-v\fR
.br
.RS
Enables verbose mode to report about network readiness and congestion.
.RE
.SH "THE PLOT DATA FILE"
|
||
|
The plot data file is written by the \fBresperf\fR program and contains the
|
||
|
data to be plotted using \fBgnuplot\fR. When running \fBresperf\fR via the
|
||
|
\fBresperf\-report\fR script, there is no need for the user to deal with
|
||
|
this file directly, but its format and contents are documented here for
|
||
|
completeness and in case you wish to run \fBresperf\fR directly and use its
|
||
|
output for purposes other than viewing it with \fBgnuplot\fR.
|
||
|
|
||
|
The first line of the file is a comment identifying the fields. It may be
|
||
|
recognized as a comment by its leading hash sign (#).
|
||
|
|
||
|
Subsequent lines contain the actual plot data. For purposes of generating
|
||
|
the plot data file, the test run is divided into time intervals of 0.5
|
||
|
seconds (or some other length of time specified with the \fB\-i\fR command
|
||
|
line option). Each line corresponds to one such interval, and contains the
|
||
|
following values as floating-point numbers:
|
||
|
|
||
|
\fBTime\fR
|
||
|
.br
|
||
|
.RS
|
||
|
The midpoint of this time interval, in seconds since the beginning of the
|
||
|
run
|
||
|
.RE
|
||
|
|
||
|
\fBTarget queries per second\fR
|
||
|
.br
|
||
|
.RS
|
||
|
The number of queries per second scheduled to be sent in this time interval
|
||
|
.RE
|
||
|
|
||
|
\fBActual queries per second\fR
|
||
|
.br
|
||
|
.RS
|
||
|
The number of queries per second actually sent in this time interval
|
||
|
.RE
|
||
|
|
||
|
\fBResponses per second\fR
|
||
|
.br
|
||
|
.RS
|
||
|
The number of responses received corresponding to queries sent in this time
|
||
|
interval, divided by the length of the interval
|
||
|
.RE
|
||
|
|
||
|
\fBFailures per second\fR
|
||
|
.br
|
||
|
.RS
|
||
|
The number of responses received corresponding to queries sent in this time
|
||
|
interval and having an RCODE other than NOERROR or NXDOMAIN, divided by the
|
||
|
length of the interval
|
||
|
.RE
|
||
|
|
||
|
\fBAverage latency\fR
|
||
|
.br
|
||
|
.RS
|
||
|
The average time between sending the query and receiving a response, for
|
||
|
queries sent in this time interval
|
||
|
.RE
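
An illustrative excerpt of such a file (the exact header text and the
number of digits vary between versions and runs) might look like:
.RS
.hy 0

.nf
# time target_qps actual_qps responses_per_sec failures_per_sec avg_latency
0.250000 416.7 416.0 414.0 2.0 0.003120
0.750000 1250.0 1250.0 1248.0 4.0 0.002980
.fi
.hy
.RE

Because each line is just whitespace\-separated numeric columns, the file
can also be plotted directly; for example, with \fBgnuplot\fR:
.RS
.hy 0

.nf
plot "resperf.gnuplot" using 1:3 with lines, "" using 1:4 with lines
.fi
.hy
.RE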
.SH "SEE ALSO"
\fBdnsperf\fR(1)
.SH AUTHOR
Nominum, Inc.
.LP
Maintained by DNS-OARC
.LP
.RS
.I https://www.dns-oarc.net/
.RE
.LP
.SH BUGS
For issues and feature requests please use:
.LP
.RS
\fI@PACKAGE_URL@\fP
.RE
.LP
For questions and help please use:
.LP
.RS
\fI@PACKAGE_BUGREPORT@\fP
.RE
.LP