Building a test environment with GoReplay
Introduction to GoReplay (official website, GitHub repository)
GoReplay is the simplest and safest way to test your app using real traffic before you put it into production.
As your application grows, the effort required to test it also grows exponentially. GoReplay offers you the simple idea of reusing your existing traffic for testing, which makes it incredibly powerful. Our state-of-the-art technique lets you analyze and record your application traffic without affecting it. This eliminates the risks that come with putting a third-party component in the critical path.
GoReplay increases your confidence in code deployments, configuration changes and infrastructure changes. Did we mention that no coding is required?
Here is the basic workflow: the listener server catches HTTP traffic and either sends it to the replay server or saves it to a file. The replay server forwards the traffic to a given address.
Installation
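Download the latest binary from https://github.com/buger/gor/releases or compile it yourself.
This walkthrough uses the first option, the prebuilt binary. The detailed steps follow.
1. Download the latest GoReplay release
https://github.com/buger/gor/releases
The latest version at the time of writing is v0.16.0. Here we download the Linux build, gor_0.16.0_x64.tar.gz.
2. Extract the downloaded archive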
tar -zxf gor_0.16.0_x64.tar.gz
Getting started
The most basic setup will be
sudo ./gor --input-raw :8000 --output-stdout
which acts like tcpdump. If you already have a test environment, you can start replaying:
sudo ./gor --input-raw :8000 --output-http http://staging.env
See our documentation and the Getting started page for more info.
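1. Prepare the test environment
Assume the production app listens on port 8080 and the test app on port 8090 (both on the same machine), and that both are running normally.
2. Check that GoReplay runs correctly.
Start gor so that it listens to the production HTTP requests and prints them to the console: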
sudo ./gor --input-raw :8080 --output-stdout
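Open a link on the production site in a browser; GoReplay prints the captured request to its console.
For real use, there are two modes to choose from:
1. Replaying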
Now it’s time to replay your original traffic to another environment. Let’s start the same file web server but on a different port: gor file-server :8001.
Instead of --output-stdout we will use --output-http and provide the URL of the second server:
sudo ./gor --input-raw :8000 --output-http="http://localhost:8001"
2. Saving requests to a file and replaying them later
Sometimes it is not possible to replay requests in real time; Gor allows you to save requests to a file and replay them later.
First use --output-file to save them:
sudo ./gor --input-raw :8000 --output-file=requests.gor
This will create a new file and continuously write all captured requests to it.
Let’s re-run Gor, but now to replay requests from file:
./gor --input-file requests.gor --output-http="http://localhost:8001"
You should see all the recorded requests arriving at the second server; they are replayed in the same order and with exactly the same timing as they were recorded.
This walkthrough uses the Replaying mode. The detailed steps follow.
1. Run GoReplay
sudo ./gor --input-raw :8080 --output-http="http://localhost:8090"
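2. Check the logs of app1 and app2
Both app1 and app2 receive the requests, and normal access to app1 is not affected in any way.
3. Run GoReplay automatically in the background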
sudo nohup ./gor --input-raw :8080 --output-http="http://localhost:8090" &
tail -500f nohup.out
4. Running as a non-root user
You can enable Gor for non-root users in a secure way by using the following commands
# Following commands assume that you put `gor` binary to /usr/local/bin
addgroup gor
addgroup <username> gor
chgrp gor /usr/local/bin/gor
chmod 0750 /usr/local/bin/gor
setcap "cap_net_raw,cap_net_admin+eip" /usr/local/bin/gor
A brief explanation of the above:
- We create a group called gor.
- We then add the user you want to the new group so they will be able to use gor without sudo.
- We then change the group of the gor binary to the new group.
- We then make sure the permissions are set on the gor binary so that members of the group can execute it but other normal users cannot.
- We then use setcap to give the CAP_NET_RAW and CAP_NET_ADMIN privileges to the executable when it runs. This is so that Gor can open its raw socket, which is not normally permitted unless you are root.
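In summary:
- Create a group named gor
- Add the relevant user to the gor group
- Grant the permissions
The detailed steps follow.
1. Create the gor group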
groupadd gor
2. Add the user who will run gor to the gor group
Here we add the ubuntu user to the gor group
addgroup ubuntu gor
3. Grant the permissions
The compiled gor binary is located at /usr/local/bin/gor
chgrp gor /usr/local/bin/gor
chmod 0750 /usr/local/bin/gor
setcap "cap_net_raw,cap_net_admin+eip" /usr/local/bin/gor
4. Run
Switch to the ubuntu user
su ubuntu
Switch to the gor group
newgrp gor
Go to the directory that contains gor
cd /usr/local/bin
Run gor
./gor --input-raw :8000 --output-stdout
Advanced GoReplay usage (more details)
1. Saving and replaying from file
You can save requests to a file and replay them later. While replaying, Gor preserves the original time differences between requests. If you apply percentage-based limiting or rate limiting, the timing between requests is reduced or increased accordingly, which opens up possibilities such as load testing (see below).
# write to file
gor --input-raw :80 --output-file requests.gor
# read from file
gor --input-file requests.gor --output-http "http://staging.com"
By default Gor writes files in chunks. This is configurable with the --output-file-append option, which controls whether a flushed chunk is appended to the existing file. The default is false, so --output-file flushes each chunk to a different path.
gor ... --output-file %Y%m%d.log
# append false
20140608_0.log
20140608_1.log
20140609_0.log
20140609_1.log
This makes parallel file processing easy. If you do not want this behavior, disable it by adding the --output-file-append option:
gor ... --output-file %Y%m%d.log --output-file-append
# append true
20140608.log
20140609.log
If you run gor multiple times, and it finds existing files, it will continue from last known index.
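The chunk naming shown in the listings above can be sketched in a few lines of Python. This is an illustration only, not GoReplay's implementation: the function name chunk_name is hypothetical, and the rule of inserting _<index> before the file extension is inferred from the example file names.

```python
import os


def chunk_name(base_name: str, index: int, append: bool = False) -> str:
    """Return the file name a chunk would be written to (illustrative only)."""
    if append:
        # With appending enabled, every chunk goes to the same file.
        return base_name
    # Otherwise the chunk index is placed before the extension.
    root, ext = os.path.splitext(base_name)
    return f"{root}_{index}{ext}"
```

For example, chunk_name("20140608.log", 0) gives "20140608_0.log", while chunk_name("20140608.log", 0, append=True) gives "20140608.log".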
Chunk size
You can set chunk limits using the --output-file-size-limit and --output-file-queue-limit options, which set the size of each chunk and the length of the chunk queue, respectively. The default values are 32mb and 256, respectively. The suffixes “k” (KB), “m” (MB), and “g” (GB) can be used for --output-file-size-limit. If you want only a size constraint, set --output-file-queue-limit to 0, and vice versa.
gor --input-raw :80 --output-file %Y-%m-%d.gz --output-file-size-limit 256m --output-file-queue-limit 0
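To make the size suffixes concrete, here is a small Python sketch that converts a value such as 256m into bytes. It is not taken from GoReplay: the helper name parse_size is hypothetical, and the use of binary multiples (1 KB = 1024 bytes) is an assumption.

```python
# Assumption: suffixes are binary multiples (k = 1024 bytes, and so on).
UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_size(text: str) -> int:
    """Convert a size such as '256m' or '32mb' into a number of bytes."""
    text = text.strip().lower()
    if text.endswith("b"):
        # Accept the '32mb' spelling by dropping the trailing 'b'.
        text = text[:-1]
    if text and text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    # No suffix: the value is already in bytes.
    return int(text)
```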
Using date variables in file names
For example, you can tell Gor to create a new file each hour: --output-file /mnt/logs/requests-%Y-%m-%d-%H.log
It will create a new file for each hour: requests-2016-06-01-12.log, requests-2016-06-01-13.log, …
A time format can be used as part of the file name. The following characters are replaced with actual values when the file is created:
%Y: year including the century (at least 4 digits)
%m: month of the year (01..12)
%d: Day of the month (01..31)
%H: Hour of the day, 24-hour clock (00..23)
%M: Minute of the hour (00..59)
%S: Second of the minute (00..60)
The default format is %Y%m%d%H, which creates one file per hour.
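The placeholders listed above happen to match the ones Python's strftime understands, so the expansion can be previewed with a short sketch. The function name expand_file_name is hypothetical, and Python supports more directives than the six listed here, so treat this only as a way to check a pattern before passing it to --output-file.

```python
from datetime import datetime


def expand_file_name(pattern: str, when: datetime) -> str:
    """Expand %Y, %m, %d, %H, %M and %S in a file name pattern."""
    return when.strftime(pattern)


# Two requests captured in the same hour map to the same file name.
first = expand_file_name("requests-%Y-%m-%d-%H.log", datetime(2016, 6, 1, 12, 5))
second = expand_file_name("requests-%Y-%m-%d-%H.log", datetime(2016, 6, 1, 12, 59))
```

Both first and second are "requests-2016-06-01-12.log"; a request captured at 13:00 would start a new file.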
GZIP compression
To read or write GZIP-compressed files, make sure that the file name ends with “.gz”: --output-file log.gz
Replaying from multiple files
--input-file accepts a file pattern, for example: --input-file logs-2016-05-*. It will replay all the matching files, sorting them in lexicographical order.
2. Performance testing
Currently, this functionality is supported only by input-file, and only when using the percentage-based limiter. Unlike the default limiter, for input-file it slows down or speeds up request emitting instead of dropping requests. Note that the limiter is applied to the input:
# Replay from file on 2x speed
gor --input-file "requests.gor|200%" --output-http "staging.com"
Use --stats --output-http-stats to see latency stats.
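The effect of a percentage on replay timing can be sketched as a scaling of the recorded gaps between requests. This is a simplified model under stated assumptions, not GoReplay's code: the function name scale_delays is hypothetical, and delays are taken to be the recorded intervals in seconds.

```python
def scale_delays(delays, percent):
    """Scale recorded inter-request delays for a replay speed given in percent.

    200 means twice as fast (delays halved), 50 means half speed (delays doubled).
    """
    if percent <= 0:
        raise ValueError("percent must be positive")
    factor = 100.0 / percent
    return [delay * factor for delay in delays]
```

With "requests.gor|200%", gaps of 1.0 s and 0.5 s would shrink to 0.5 s and 0.25 s in this model.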
Looping files for replaying indefinitely
You can loop the same set of files, so that when the last one has replayed all its requests, Gor does not stop but starts again from the first one. This lets you do extensive performance testing with only a small number of recorded requests. Pass --input-file-loop to enable it.
3. Rate limiting
Rate limiting can be useful if you only want to forward parts of incoming traffic, for example, to not overload your test environment. There are two strategies: dropping random requests or dropping fractions of requests based on Header or URL param value.
Dropping random requests
Every input and output supports random rate limiting. There are two limiting algorithms: absolute and percentage based.
Absolute: if the specified request limit is reached for the current second, the rest are disregarded; the counter resets on the next second.
Percentage: for input-file it slows down or speeds up request execution; for the rest it uses a random generator to decide whether a request passes, based on the chance you specified.
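The two algorithms just described can be sketched as follows. This is a minimal model, not GoReplay's implementation; the names AbsoluteLimiter and percentage_allow are hypothetical, and the current time is passed in explicitly so the behavior is easy to follow.

```python
import random


class AbsoluteLimiter:
    """Allow at most `limit` requests per wall-clock second."""

    def __init__(self, limit: int):
        self.limit = limit
        self.current_second = None
        self.count = 0

    def allow(self, now: float) -> bool:
        second = int(now)
        if second != self.current_second:
            # A new second has started: reset the counter.
            self.current_second = second
            self.count = 0
        if self.count >= self.limit:
            return False
        self.count += 1
        return True


def percentage_allow(percent: float, rng=random) -> bool:
    """Let a request pass with the given chance (0..100)."""
    return rng.random() * 100 < percent
```

With AbsoluteLimiter(2), requests at 10.0 s and 10.2 s pass, a third at 10.9 s is dropped, and a request at 11.0 s passes again because the counter has been reset.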
You can specify your desired limit using the “|” operator after the server address, see examples below.
Limiting replay using absolute number
# staging.server will not get more than ten requests per second
gor --input-tcp :28020 --output-http "http://staging.com|10"
Limiting listener using percentage based limiter
# replay server will not get more than 10% of requests
# useful for high-load environments
gor --input-raw :80 --output-tcp "replay.local:28020|10%"
Consistent limiting based on Header or URL param value
If you have a unique user id (like an API key) stored in a header or URL, you can consistently forward the specified percentage of traffic for only a fraction of these users. The basic formula looks like this: FNV32-1A_hashing(value) % 100 >= chance. Examples:
# Limit based on header value
gor --input-raw :80 --output-tcp "replay.local:28020|10%" --http-header-limiter "X-API-KEY: 10%"
# Limit based on URL param value
gor --input-raw :80 --output-tcp "replay.local:28020|10%" --http-param-limiter "api_key: 10%"
When limiting based on a header or param, only percentage-based limiting is supported.
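The hashing formula above can be tried out with a small Python sketch. FNV-1a (32-bit) is a published algorithm and is implemented here directly; the function name is_forwarded is hypothetical, and reading the formula's ">= chance" as the condition under which a request is dropped is an assumption.

```python
FNV_OFFSET_BASIS = 0x811C9DC5
FNV_PRIME = 0x01000193


def fnv1a_32(data: bytes) -> int:
    """32-bit FNV-1a hash."""
    h = FNV_OFFSET_BASIS
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME) & 0xFFFFFFFF
    return h


def is_forwarded(value: str, percent: int) -> bool:
    """Decide consistently whether requests carrying `value` are forwarded.

    Assumption: a request is dropped when hash % 100 >= percent.
    """
    return fnv1a_32(value.encode("utf-8")) % 100 < percent
```

Because the decision depends only on the header or param value, the same user is either always forwarded or always dropped, which keeps that user's sessions complete on the replay side.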
4. Request filtering
Filtering is useful when you need to capture only a specific part of the traffic, like API requests. It is possible to filter by URL, HTTP header or HTTP method.
Allow url regexp
# only forward requests being sent to the /api endpoint
gor --input-raw :8080 --output-http staging.com --http-allow-url /api
Disallow url regexp
# only forward requests NOT being sent to the /api... endpoint
gor --input-raw :8080 --output-http staging.com --http-disallow-url /api
Filter based on regexp of header
# only forward requests with an api version of 1.0x
gor --input-raw :8080 --output-http staging.com --http-allow-header "api-version:^1\.0\d"
# only forward requests NOT containing User-Agent header value "Replayed by Gor"
gor --input-raw :8080 --output-http staging.com --http-disallow-header "User-Agent: Replayed by Gor"
Filter based on HTTP method
Requests not matching a specified whitelist can be filtered out. For example to strip non-nullipotent requests:
gor --input-raw :80 --output-http "http://staging.server" \
--http-allow-method GET \
--http-allow-method OPTIONS
5. Request rewriting
Gor supports rewriting of URLs, URL params and headers, see below.
Rewriting may be useful if your test environment does not have the same data as production and you want to perform all actions in the context of a test user: for example, rewrite all API tokens to some test value. Other possible use cases are toggling features on/off using custom headers, or rewriting URLs if they changed in the new environment.
For more complex logic you can use Middleware.
Rewrite URL based on a mapping
--http-rewrite-url expects a value in “<regexp>:<replace>” format, where “:” is the delimiter. In the <replace> section you may use captured regexp group values. This works similarly to the replace method in JavaScript or gsub in Ruby.
# Rewrites all `/v1/user/<user_id>/ping` requests to `/v2/user/<user_id>/ping`
gor --input-raw :8080 --output-http staging.com --http-rewrite-url '/v1/user/([^/]+)/ping:/v2/user/$1/ping'
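The capture-group substitution behind this rule can be previewed with Python's re.sub. This only illustrates the regexp semantics and is not how Gor applies the rule; the function name rewrite_url is hypothetical, and Python writes the captured group as \1 where the Gor option uses $1.

```python
import re


def rewrite_url(path: str, pattern: str, replacement: str) -> str:
    """Rewrite a URL path, reusing captured groups in the replacement."""
    return re.sub(pattern, replacement, path)


rewritten = rewrite_url(
    "/v1/user/42/ping",
    r"/v1/user/([^/]+)/ping",
    r"/v2/user/\1/ping",
)
```

Here rewritten is "/v2/user/42/ping"; a path that does not match the pattern is returned unchanged.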
Set URL param
Sets a request URL param; if the param already exists, it is overwritten.
gor --input-raw :8080 --output-http staging.com --http-set-param api_key=1
Set Header
Sets a request header; if the header already exists, it is overwritten. This may be useful if you need to identify requests generated by Gor or enable feature-flagged functionality in an application:
gor --input-raw :80 --output-http "http://staging.server" \
--http-header "User-Agent: Replayed by Gor" \
--http-header "Enable-Feature-X: true"
Host header
The Host header gets special treatment. By default Host is set to the value specified in --output-http. If you manually set --http-header "Host: another.com", Gor will not override the Host value.
If your app accepts traffic from multiple domains and you want to keep the original headers, there is a dedicated --http-original-host option, which tells Gor not to touch the Host header at all.
6. Middleware
Overview
Middleware is a program that accepts request and response payloads on STDIN and emits modified requests on STDOUT. You can implement any custom logic, such as stripping private data, advanced rewriting, or support for OAuth. Check the examples included in our repo.
                   Original request      +--------------+
+-------------+----------STDIN---------->+              |
|  Gor input  |                          |  Middleware  |
+-------------+----------STDIN---------->+              |
                   Original response (1) +------+---+---+
                                                |   ^
+-------------+    Modified request             v   |
| Gor output  +<---------STDOUT-----------------+   |
+-----+-------+                                     |
      |                                             |
      |            Replayed response                |
      +------------------STDIN----------------->----+
(1): Original responses will only be sent to the middleware if the --input-raw-track-response option is specified.
Middleware can be written in any language; see the examples/middleware folder for examples. A middleware program should accept the fact that all communication with Gor is asynchronous: there is no guarantee that the original request and response messages will arrive one after another. Your app should keep track of state if its logic depends on the original or replayed response; see examples/middleware/token_modifier.go for an example.
Simple bash echo middleware (returns same request) will look like this:
while read -r line; do
  echo "$line"
done
Middleware can be enabled using the --middleware option, by specifying the path to an executable file:
gor --input-raw :80 --middleware "/opt/middleware_executable" --output-http "http://staging.server"
Communication protocol
All messages should be hex encoded. A new line character specifies the end of a message, i.e. there is one message per line.
The decoded payload consists of 2 parts: the header and the HTTP payload, separated by a new line character.
Example request payload:
1 932079936fa4306fc308d67588178d17d823647c 1439818823587396305
GET /a HTTP/1.1
Host: 127.0.0.1
Example response payload (note: you will only receive this if you specify --input-raw-track-response)
2 8e091765ae902fef8a2b7d9dd960e9d52222bd8c 1439818823587996305 2782013
HTTP/1.1 200 OK
Date: Mon, 17 Aug 2015 13:40:23 GMT
Content-Length: 0
Content-Type: text/plain; charset=utf-8
The header contains request meta information separated by spaces. The first value is the payload type; possible values are 1 (request), 2 (original response), and 3 (replayed response). Next comes the request ID: it is unique among all requests (SHA-1 of time and Ack) but stays the same for the original and replayed responses, so you can associate requests with responses. The third value is the time when the request/response was initiated/received. The fourth value is populated only for responses and is the latency.
The HTTP payload is the unmodified HTTP request/response intercepted from the network. You can operate on the payload as you want: add headers, change the path, and so on. You are basically just editing a string; just ensure that the result is RFC compliant.
At the end, the modified (or untouched) request should be emitted back to STDOUT, keeping the original header, and hex-encoded. If you want to filter a request out, simply do not send it. Emitting responses back is required, even if you did not touch them.
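The protocol described in this section can be sketched as a pass-through middleware in Python. It is a minimal sketch built only from the description above, not one of the examples shipped in the GoReplay repository; the function names decode_message, encode_message and run are hypothetical.

```python
import binascii


def decode_message(line: str):
    """Split one hex-encoded line into (header fields, raw HTTP payload)."""
    raw = binascii.unhexlify(line.strip())
    # The first line is the header; everything after it is the HTTP payload.
    header_line, _, http_payload = raw.partition(b"\n")
    return header_line.split(b" "), http_payload


def encode_message(header_fields, http_payload: bytes) -> str:
    """Rebuild the hex-encoded line from header fields and HTTP payload."""
    raw = b" ".join(header_fields) + b"\n" + http_payload
    return binascii.hexlify(raw).decode("ascii")


def run(stdin, stdout) -> None:
    """Echo every message back unchanged, one message per line."""
    for line in stdin:
        if not line.strip():
            continue
        header_fields, http_payload = decode_message(line)
        # header_fields[0] is the payload type: b"1", b"2" or b"3".
        # Custom logic (filtering, rewriting) would go here.
        stdout.write(encode_message(header_fields, http_payload) + "\n")
        stdout.flush()
```

A script used with --middleware would import sys and end with run(sys.stdin, sys.stdout). Flushing after every message matters because Gor reads the middleware output as a stream.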
Advanced example
Imagine that you have an auth system that randomly generates access tokens, which are later used to access secure content. Since there is no predefined token value, a naive approach without middleware (or with middleware that uses only request payloads) will fail, because the replayed server has its own tokens, not synced with the origin. To fix this, the middleware should take into account the responses of both the replayed and origin servers, store originalToken -> replayedToken aliases, and rewrite all requests that use an original token to use its replayed alias. See examples/middleware/token_modifier.go and middleware_test.go#TestTokenMiddleware for an example of this scheme.
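The alias bookkeeping in this scheme can be sketched in Python as below. This is not a port of token_modifier.go: the class name TokenAliases is hypothetical, and extracting a token from a response body is application specific, so it is assumed to have happened before record is called. Responses are paired by request ID, which stays the same for the original and replayed response.

```python
class TokenAliases:
    """Track originalToken -> replayedToken aliases across async responses."""

    def __init__(self):
        self.original_by_request = {}  # request id -> token in original response
        self.replayed_by_request = {}  # request id -> token in replayed response
        self.aliases = {}              # original token -> replayed token

    def record(self, payload_type: str, request_id: str, token: str) -> None:
        """Store a token seen in a response ("2" original, "3" replayed)."""
        if payload_type == "2":
            self.original_by_request[request_id] = token
        elif payload_type == "3":
            self.replayed_by_request[request_id] = token
        original = self.original_by_request.get(request_id)
        replayed = self.replayed_by_request.get(request_id)
        # The two responses may arrive in either order; pair them once both exist.
        if original is not None and replayed is not None:
            self.aliases[original] = replayed

    def rewrite(self, request_text: str) -> str:
        """Replace every known original token with its replayed alias."""
        for original, replayed in self.aliases.items():
            request_text = request_text.replace(original, replayed)
        return request_text
```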
7. Distributed configuration
Sometimes it makes sense to use a separate Gor instance for replaying traffic and for tasks like load testing, so that your production machines do not spend precious resources. You can configure Gor on your web machines to forward traffic to a Gor aggregator instance running on a separate server.
# Run on servers where you want to catch traffic. You can run it on each `web` machine.
sudo gor --input-raw :80 --output-tcp replay.local:28020
# Replay server (replay.local).
gor --input-tcp replay.local:28020 --output-http http://staging.com
If you have multiple replay machines, you can split traffic among them using the --split-output option: it splits all incoming traffic equally across all outputs using a round-robin algorithm.
gor --input-raw :80 --split-output --output-tcp replay1.local:28020 --output-tcp replay2.local:28020
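Round-robin splitting, as used by --split-output, can be illustrated with a short Python sketch. It models only the assignment order and is not GoReplay's code; the function name split_round_robin is hypothetical.

```python
from itertools import cycle


def split_round_robin(requests, outputs):
    """Assign each request to the next output in turn."""
    assignment = {output: [] for output in outputs}
    targets = cycle(outputs)
    for request in requests:
        assignment[next(targets)].append(request)
    return assignment


result = split_round_robin(
    ["r1", "r2", "r3", "r4", "r5"],
    ["replay1.local:28020", "replay2.local:28020"],
)
```

In this example replay1.local:28020 receives r1, r3 and r5, and replay2.local:28020 receives r2 and r4.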
In case you are planning large-scale load testing, consider using a separate master instance that controls the Gor slaves which actually replay traffic. For example:
# This command reads multiple log files, replays them at 10x speed, loops them if needed for 30 seconds, and distributes traffic (TCP session aware) among multiple workers
gor --input-file "logs_from_multiple_machines.*|1000%" --input-file-loop --exit-after 30s --recognize-tcp-sessions --split-output --output-tcp worker1.local:27017 --output-tcp worker2.local:27017 --output-tcp worker3.local:27017 ... --output-tcp workerN.local:27017
# worker
gor --input-tcp :27017 --output-http load_test.target
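Appendix: setting up a Go environment
GoReplay is written in Go. To compile the GoReplay binary yourself, you need a Go toolchain; the steps below set one up.
If you use the official prebuilt binary, as in this walkthrough, no Go environment is required.
Download the Linux version of Go
The latest official version at the time of writing is go1.8.1; download go1.8.1.linux-amd64.tar.gz
Extract it
tar -C /usr/local -xzf go1.8.1.linux-amd64.tar.gz
Configure the environment variable
export PATH=$PATH:/usr/local/go/bin
For more details, visit the official Go website: https://golang.org/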