背景

Helios 系统要处理的数据量比较大,尤其是查询所有服务一天的评分数据时要返回每日 1440 分钟的所有应用的评分,总计有几十万个数据点,接口有时延迟会达到数秒。本文记录如何利用 Arthas ,将接口从几百几千 ms,优化到几十 ms。

链路:

img

从链路上看,线上获取一整天的数据时大概 300 多 ms,而查询数据库只有 11ms,说明大部分时间都是程序组装数据时消耗的,于是动起了优化代码的念头。

优化过程

温馨提示:代码可以不用看,没有上下文的情况下很难明白函数什么意思。主要看 Arthas Trace 的结果与优化思路。

初始未优化版本

代码

    private HeliosGetScoreResponse queryScores(HeliosGetScoreRequest request) {
        HeliosGetScoreResponse response = new HeliosGetScoreResponse();

        List<HeliosScore> heliosScores = heliosService.queryScoresTimeBetween(request.getStartTime(), request.getEndTime(), request.getFilterByAppId());
        if (CollectionUtils.isEmpty(heliosScores)) {
            return response;
        }

        Set<String> dateSet = new HashSet<>();

        Map<String, List<HeliosScore>> groupByAppIdHeliosScores = heliosScores.stream().collect(Collectors.groupingBy(HeliosScore::getAppId));
        for (List<HeliosScore> value : groupByAppIdHeliosScores.values()) {
            value.sort(Comparator.comparing(HeliosScore::getTimeFrom));
            HeliosGetScoreResponse.Score score = new HeliosGetScoreResponse.Score();
            score.setNamespace(value.get(0).getNamespace());
            score.setAppId(value.get(0).getAppId());
            for (HeliosScore heliosScore : value) {
                List<HeliosScore> splitHeliosScores = heliosScore.split();
                for (HeliosScore splitHeliosScore : splitHeliosScores) {
                    if (splitHeliosScore.getTimeFrom().compareTo(request.getStartTime()) < 0) {
                        continue;
                    }
                    if (splitHeliosScore.getTimeFrom().compareTo(request.getEndTime()) > 0) {
                        break;
                    }
                    dateSet.add(DateUtils.yyyyMMddHHmm.formatDate(splitHeliosScore.getTimeFrom()));
                    if (splitHeliosScore.getScores() == null) {
                        splitHeliosScore.setScores("100");
                        log.error("查询时发现数据缺失: {}", heliosScore);
                    }
                    score.add(Math.max(0, Integer.parseInt(splitHeliosScore.getScores())), null);
                }
            }
            response.getValues().add(score);
        }

        response.setDates(new ArrayList<>(dateSet).stream().sorted().collect(Collectors.toList()));
        return response;
    }

Arthas Trace

`---ts=2021-08-17 16:28:00;thread_name=http-nio-8080-exec-10;id=81;is_daemon=true;priority=5;TCCL=org.springframework.boot.web.embedded.tomcat.TomcatEmbeddedWebappClassLoader@20864cd1
    `---[4046.398447ms] xxxService.controller.HeliosController:queryScores()
        +---[0.022259ms] xxxService.model.helios.HeliosGetScoreResponse:<init>() #147
        +---[0.007132ms] xxxService.model.helios.HeliosGetScoreRequest:getStartTime() #149
        +---[0.006985ms] xxxService.model.helios.HeliosGetScoreRequest:getEndTime() #149
        +---[0.008704ms] xxxService.model.helios.HeliosGetScoreRequest:getFilterByAppId() #149
        +---[19.284658ms] xxxService.service.HeliosService:queryScoresTimeBetween() #149
        +---[0.017468ms] org.apache.commons.collections.CollectionUtils:isEmpty() #150
        +---[0.008054ms] java.util.HashSet:<init>() #154
        +---[0.027591ms] java.util.List:stream() #156
        +---[0.044229ms] java.util.stream.Collectors:groupingBy() #156
        +---[0.155582ms] java.util.stream.Stream:collect() #156
        +---[0.018318ms] java.util.Map:values() #157
        +---[0.019199ms] java.util.Collection:iterator() #157
        +---[min=3.51E-4ms,max=0.014266ms,total=0.125003ms,count=123] java.util.Iterator:hasNext() #157
        +---[min=5.11E-4ms,max=0.010188ms,total=0.145693ms,count=122] java.util.Iterator:next() #157
        +---[min=4.89E-4ms,max=0.045356ms,total=0.321978ms,count=122] java.util.Comparator:comparing() #158
        +---[min=0.003637ms,max=0.033049ms,total=0.928795ms,count=122] java.util.List:sort() #158
        +---[min=5.94E-4ms,max=0.010442ms,total=0.1485ms,count=122] xxxService.model.helios.HeliosGetScoreResponse$Score:<init>() #159
        +---[min=4.5E-4ms,max=0.010857ms,total=0.12773ms,count=122] java.util.List:get() #160
        +---[min=5.01E-4ms,max=0.007849ms,total=0.123696ms,count=122] xxxService.helios.entity.HeliosScore:getNamespace() #160
        +---[min=6.5E-4ms,max=0.007324ms,total=0.135906ms,count=122] xxxService.model.helios.HeliosGetScoreResponse$Score:setNamespace() #160
        +---[min=3.72E-4ms,max=0.010288ms,total=0.086703ms,count=122] java.util.List:get() #161
        +---[min=5.1E-4ms,max=0.00627ms,total=0.103871ms,count=122] xxxService.helios.entity.HeliosScore:getAppId() #161
        +---[min=5.97E-4ms,max=0.006531ms,total=0.126184ms,count=122] xxxService.model.helios.HeliosGetScoreResponse$Score:setAppId() #161
        +---[min=4.45E-4ms,max=0.020198ms,total=0.138299ms,count=122] java.util.List:iterator() #162
        +---[min=3.42E-4ms,max=0.014615ms,total=0.256056ms,count=366] java.util.Iterator:hasNext() #162
        +---[min=3.59E-4ms,max=0.014974ms,total=0.174396ms,count=244] java.util.Iterator:next() #162
        +---[min=0.071035ms,max=0.148132ms,total=19.444179ms,count=244] xxxService.helios.entity.HeliosScore:split() #163
        +---[min=4.06E-4ms,max=0.022364ms,total=0.210152ms,count=244] java.util.List:iterator() #164
        +---[min=3.07E-4ms,max=0.199649ms,total=143.267893ms,count=351604] java.util.Iterator:hasNext() #164
        +---[min=3.25E-4ms,max=24.863976ms,total=177.15363ms,count=351360] java.util.Iterator:next() #164
        +---[min=3.93E-4ms,max=0.096771ms,total=176.843018ms,count=351360] xxxService.helios.entity.HeliosScore:getTimeFrom() #165
        +---[min=4.07E-4ms,max=18.772715ms,total=205.632183ms,count=351360] xxxService.model.helios.HeliosGetScoreRequest:getStartTime() #165
        +---[min=3.33E-4ms,max=0.045589ms,total=149.24486ms,count=351360] java.util.Date:compareTo() #165
        +---[min=3.93E-4ms,max=0.032972ms,total=86.466793ms,count=175680] xxxService.helios.entity.HeliosScore:getTimeFrom() #168
        +---[min=4.12E-4ms,max=0.061003ms,total=94.294061ms,count=175680] xxxService.model.helios.HeliosGetScoreRequest:getEndTime() #168
        +---[min=3.37E-4ms,max=0.038792ms,total=74.505056ms,count=175680] java.util.Date:compareTo() #168
        +---[min=3.97E-4ms,max=0.036548ms,total=87.693935ms,count=175680] xxxService.helios.entity.HeliosScore:getTimeFrom() #171
     1  +---[min=0.001952ms,max=0.068413ms,total=391.739063ms,count=175680] xxxService.utils.DateUtils$yyyyMMddHHmm:formatDate() #171
        +---[min=4.07E-4ms,max=0.037904ms,total=108.107714ms,count=175680] java.util.Set:add() #171
        +---[min=3.95E-4ms,max=0.031555ms,total=88.173857ms,count=175680] xxxService.helios.entity.HeliosScore:getScores() #172
        +---[min=3.88E-4ms,max=0.033584ms,total=84.689466ms,count=175680] xxxService.helios.entity.HeliosScore:getScores() #176
        +---[min=3.11E-4ms,max=0.038121ms,total=69.708752ms,count=175680] java.lang.Math:max() #176
        +---[min=4.66E-4ms,max=0.03391ms,total=104.476576ms,count=175680] xxxService.model.helios.HeliosGetScoreResponse$Score:add() #176
        +---[min=6.17E-4ms,max=0.01503ms,total=0.159826ms,count=122] xxxService.model.helios.HeliosGetScoreResponse:getValues() #179
        +---[min=6.44E-4ms,max=0.03742ms,total=0.21068ms,count=122] java.util.List:add() #179
        +---[0.108961ms] java.util.ArrayList:<init>() #182
        +---[0.017455ms] java.util.ArrayList:stream() #182
        +---[0.011099ms] java.util.stream.Stream:sorted() #182
        +---[0.013699ms] java.util.stream.Collectors:toList() #182
        +---[0.38178ms] java.util.stream.Stream:collect() #182
        `---[0.004627ms] xxxService.model.helios.HeliosGetScoreResponse:setDates() #182

分析

Arthas 显示总共花了 4 秒,但实际上在链路上看大概是 350~450ms 左右。其他多出来的时间是 Arthas 每一次执行统计的消耗,因为方法里的循环比较多。这也告诉我们,不要用 trace 去看循环很多的方法。会对性能有非常严重的影响。

可以看出整个函数有 3 个循环,第一层循环的数量为 appId 的数量约为 140,第二层是查出来的数据条数,一天的数据已经归并了所以这里应该是 1,第三层是时间区间的分钟数,一天的话就是 1440 个。

Trace 中可以看到消耗最多的是封装的一个 SimpleDateFormat.formatDate()

第一次优化

优化方向

  1. 遍历每个时间点的思路改变,把合并过的大对象拆分成一个个小对象直接遍历,改成先合并起来,通过时间点逻辑上遍历。这样会减少创建几十万个对象。
  2. 将时间点集合 Set<String> dateSet 改为 Set<Date>,这样减少反复 formatDate() 的开销。
  3. 优化字符串转数字的过程,减少 Integer.parseInt方法调用,改为用 Map<String, Integer> 提前创建出 0~100 的字符串数字字典。(后来经过 JMH 测试,还是 Integer.parseInt 最快)

代码

private HeliosGetScoreResponse queryScores(HeliosGetScoreRequest request) {
        HeliosGetScoreResponse response = new HeliosGetScoreResponse();

        List<HeliosScore> heliosScoresRecord = heliosService.queryScoresTimeBetween(request.getStartTime(), request.getEndTime(), request.getFilterByAppId());
        if (CollectionUtils.isEmpty(heliosScoresRecord)) {
            return response;
        }

        Set<Date> dateSet = new HashSet<>();

        List<HeliosScore> heliosScores = HeliosDataMergeJob.mergeData(heliosScoresRecord);

        Map<String, List<HeliosScore>> groupByAppIdHeliosScores = heliosScores.stream().collect(Collectors.groupingBy(HeliosScore::getAppId));

        for (List<HeliosScore> scores : groupByAppIdHeliosScores.values()) {
            HeliosScore heliosScore = scores.get(0);
            HeliosGetScoreResponse.Score score = new HeliosGetScoreResponse.Score();
            score.setNamespace(heliosScore.getNamespace());
            score.setAppId(heliosScore.getAppId());
            score.setScores(new ArrayList<>());
            response.getValues().add(score);

            List<Integer> scoreIntList = HeliosHelper.splitScores(heliosScore);

            // 以 requestTime 为准
            Calendar indexDate = DateUtils.roundDownMinute(request.getStartTime().getTime());
            int index = 0;
            // 如果 timeFrom < requestTime,则增加 timeFrom 到 requestTime
            while (indexDate.getTime().compareTo(heliosScore.getTimeFrom()) > 0) {
                heliosScore.getTimeFrom().setTime(heliosScore.getTimeFrom().getTime() + 60_000);
                index++;
            }

            while (indexDate.getTime().compareTo(request.getEndTime()) <= 0 && indexDate.getTime().compareTo(heliosScore.getTimeTo()) <= 0  && index < scoreIntList.size()) {
                Integer scoreInt = scoreIntList.get(index++);
                score.getScores().add(scoreInt);
                dateSet.add(indexDate.getTime());
                indexDate.add(Calendar.MINUTE, 1);
            }
        }

        response.setDates(new ArrayList<>(dateSet).stream().sorted().map(DateUtils.yyyyMMddHHmm::formatDate).collect(Collectors.toList()));
        return response;
    }

Arthas Trace

---ts=2021-08-17 14:44:11;thread_name=http-nio-8080-exec-10;id=ab;is_daemon=true;priority=5;TCCL=org.springframework.boot.web.embedded.tomcat.TomcatEmbeddedWebappClassLoader@16ea0f22
    `---[6997.005629ms] xxxService.controller.HeliosController:queryScores()
        +---[0.020032ms] xxxService.model.helios.HeliosGetScoreResponse:<init>() #149
        +---[0.007451ms] xxxService.model.helios.HeliosGetScoreRequest:getStartTime() #151
        +---[min=0.001054ms,max=7.458198ms,total=213.19538ms,count=170754] xxxService.model.helios.HeliosGetScoreRequest:getEndTime() #57
        +---[0.007267ms] xxxService.model.helios.HeliosGetScoreRequest:getFilterByAppId() #57
        +---[15.255919ms] xxxService.service.HeliosService:queryScoresTimeBetween() #57
        +---[0.020045ms] org.apache.commons.collections.CollectionUtils:isEmpty() #152
        +---[0.015161ms] java.util.HashSet:<init>() #156
        +---[20.06713ms] xxxService.helios.jobs.HeliosDataMergeJob:mergeData() #158
        +---[0.043042ms] java.util.List:stream() #160
        +---[0.028232ms] java.util.stream.Collectors:groupingBy() #57
        +---[min=0.087087ms,max=1.931641ms,total=2.018728ms,count=2] java.util.stream.Stream:collect() #57
        +---[0.0151ms] java.util.Map:values() #162
        +---[0.019611ms] java.util.Collection:iterator() #57
        +---[min=7.55E-4ms,max=0.015165ms,total=0.201221ms,count=121] java.util.Iterator:hasNext() #57
        +---[min=0.001178ms,max=0.02477ms,total=0.220931ms,count=120] java.util.Iterator:next() #57
        +---[min=8.14E-4ms,max=0.01101ms,total=0.155044ms,count=120] java.util.List:get() #163
        +---[min=0.001049ms,max=0.009425ms,total=0.231297ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:<init>() #164
        +---[min=0.001167ms,max=0.009721ms,total=0.194502ms,count=120] xxxService.helios.entity.HeliosScore:getNamespace() #165
        +---[min=0.001222ms,max=0.020409ms,total=0.264791ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:setNamespace() #57
        +---[min=0.001097ms,max=0.006475ms,total=0.169987ms,count=120] xxxService.helios.entity.HeliosScore:getAppId() #166
        +---[min=0.00121ms,max=0.007106ms,total=0.207877ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:setAppId() #57
        +---[min=8.63E-4ms,max=0.008981ms,total=0.176195ms,count=120] java.util.ArrayList:<init>() #167
        +---[min=0.001225ms,max=0.021948ms,total=0.340375ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:setScores() #57
        +---[min=0.00112ms,max=0.008984ms,total=0.196212ms,count=120] xxxService.model.helios.HeliosGetScoreResponse:getValues() #168
        +---[min=7.64E-4ms,max=0.027237ms,total=154.660479ms,count=170753] java.util.List:add() #57
        +---[min=0.028779ms,max=0.237608ms,total=20.049731ms,count=120] xxxService.helios.HeliosHelper:splitScores() #170
        +---[min=0.001178ms,max=0.008102ms,total=0.199087ms,count=120] xxxService.model.helios.HeliosGetScoreRequest:getStartTime() #173
        +---[min=6.89E-4ms,max=0.048069ms,total=140.74298ms,count=170040] java.util.Date:getTime() #57
        +---[min=0.004686ms,max=0.03805ms,total=0.775394ms,count=120] xxxService.utils.DateUtils:roundDownMinute() #57
        +---[min=7.84E-4ms,max=7.562581ms,total=162.855553ms,count=170040] java.util.Calendar:getTime() #176
      2 +---[min=9.94E-4ms,max=0.029962ms,total=385.371864ms,count=339960] xxxService.helios.entity.HeliosScore:getTimeFrom() #57
      1 +---[min=7.76E-4ms,max=7.936578ms,total=483.361269ms,count=511428] java.util.Date:compareTo() #57
        +---[min=9.95E-4ms,max=0.077109ms,total=192.749805ms,count=169920] xxxService.helios.entity.HeliosScore:getTimeFrom() #177
        +---[min=6.94E-4ms,max=7.358942ms,total=151.184751ms,count=169920] java.util.Date:setTime() #57
        +---[min=7.67E-4ms,max=0.029244ms,total=152.500401ms,count=170753] java.util.Calendar:getTime() #181
        +---[min=7.65E-4ms,max=0.016336ms,total=151.879643ms,count=170635] java.util.Calendar:getTime() #182
        +---[min=0.001011ms,max=0.028133ms,total=196.192946ms,count=170635] xxxService.helios.entity.HeliosScore:getTimeTo() #57
        +---[min=6.93E-4ms,max=0.836104ms,total=141.443001ms,count=170635] java.util.List:size() #57
        +---[min=7.63E-4ms,max=7.940119ms,total=162.285955ms,count=170633] java.util.List:get() #183
      3 +---[min=0.001068ms,max=0.973964ms,total=209.721ms,count=170633] xxxService.model.helios.HeliosGetScoreResponse$Score:getScores() #184
        +---[min=7.71E-4ms,max=0.028856ms,total=154.918574ms,count=170633] java.util.Calendar:getTime() #185
        +---[min=8.07E-4ms,max=8.030316ms,total=186.971072ms,count=170633] java.util.Set:add() #57
        +---[min=7.82E-4ms,max=0.034732ms,total=156.2645ms,count=170633] java.util.Calendar:add() #186
        +---[0.050615ms] java.util.ArrayList:<init>() #190
        +---[0.019114ms] java.util.ArrayList:stream() #57
        +---[0.029096ms] java.util.stream.Stream:sorted() #57
        +---[0.018823ms] java.util.stream.Stream:map() #57
        +---[0.009092ms] java.util.stream.Collectors:toList() #57
        `---[0.006768ms] xxxService.model.helios.HeliosGetScoreResponse:setDates() #57

分析

这一步实际上执行时间优化了 50ms 左右。

从 Trace 中看耗时时间最长的是 Date 的 compareTo,也就是代码中的 if (splitHeliosScore.getTimeFrom().compareTo(request.getStartTime()) < 0)

而比较意外的是从对象中 get 属性居然也是有开销的。

第二次优化

优化方向

结合上一次 Arthas Trace 的结果,在以下几个方向进行优化:

  1. Date 对象的换成 long 型时间戳进行比较
  2. Date 对象反复 getTime、setTime,改为 long 型时间戳 += 60_000 实现,得到结果后只 setTime 一次。
  3. 每次填充数据都往 Set<String> dateSet 放入数据,改为通过标识判断只放入一次。
  4. 存放分数的 ArrayList 在第一次循环之后,可以确认大小,之后循环创建 ArrayList 时直接填入固定的大小,减少内存创建。

代码

    private HeliosGetScoreResponse queryScores(HeliosGetScoreRequest request) {
        HeliosGetScoreResponse response = new HeliosGetScoreResponse();

        List<HeliosScore> heliosScoresRecord = heliosService.queryScoresTimeBetween(request.getStartTime(), request.getEndTime(), request.getFilterByAppId());
        if (CollectionUtils.isEmpty(heliosScoresRecord)) {
            return response;
        }

        Set<Date> dateSet = new HashSet<>();
        boolean isDateSetInitial = false;
        int scoreSize = 16;

        List<HeliosScore> heliosScores = HeliosDataMergeJob.mergeData(heliosScoresRecord);

        Map<String, List<HeliosScore>> groupByAppIdHeliosScores = heliosScores.stream().collect(Collectors.groupingBy(HeliosScore::getAppId));

        for (List<HeliosScore> scores : groupByAppIdHeliosScores.values()) {
            HeliosScore heliosScore = scores.get(0);
            HeliosGetScoreResponse.Score score = new HeliosGetScoreResponse.Score();
            score.setNamespace(heliosScore.getNamespace());
            score.setAppId(heliosScore.getAppId());
            score.setScores(new ArrayList<>(scoreSize));
            response.getValues().add(score);

            List<Integer> scoreIntList = HeliosHelper.splitScores(heliosScore);

            // 以 requestTime 为准
            long indexDateMills = request.getStartTime().getTime();
            int index = 0;
            // 如果 timeFrom < requestTime,则增加 timeFrom 到 requestTime
            long heliosScoreTimeFromMills = heliosScore.getTimeFrom().getTime();
            while (indexDateMills > heliosScoreTimeFromMills) {
                heliosScoreTimeFromMills += 60_000;
                index++;
            }
            heliosScore.getTimeFrom().setTime(heliosScoreTimeFromMills);

            long requestEndTimeMills = request.getEndTime().getTime();
            long heliosScoreTimeToMills = heliosScore.getTimeTo().getTime();
            // 循环条件为 (当前时间 <= 请求最大时间) && (当前时间 <= 数据最大时间) && (index < 数据条数)
            while (indexDateMills <= requestEndTimeMills && indexDateMills <= heliosScoreTimeToMills && index < scoreIntList.size()) {
                score.getScores().add(scoreIntList.get(index++));
                if (!isDateSetInitial) {
                    dateSet.add(new Date(indexDateMills));
                }
                indexDateMills += 60_000;
            }
            // 性能优化,减少重复放入的次数
            isDateSetInitial = true;
            // 性能优化,初始化足够的 size 减少扩容次数。 x1.1 为了万一数据数量不一致,留出一点 buffer。
            scoreSize = (int) (score.getScores().size() * 1.1);
        }

        response.setDates(new ArrayList<>(dateSet).stream().sorted().map(DateUtils.yyyyMMddHHmm::formatDate).collect(Collectors.toList()));
        return response;
    }

Arthas Trace

`---ts=2021-08-17 15:20:41;thread_name=http-nio-8080-exec-7;id=aa;is_daemon=true;priority=5;TCCL=org.springframework.boot.web.embedded.tomcat.TomcatEmbeddedWebappClassLoader@14be750c
    `---[1411.395123ms] xxxService.controller.HeliosController:queryScores()
        +---[0.016102ms] xxxService.model.helios.HeliosGetScoreResponse:<init>() #149
        +---[0.019084ms] xxxService.model.helios.HeliosGetScoreRequest:getStartTime() #151
        +---[0.007879ms] xxxService.model.helios.HeliosGetScoreRequest:getEndTime() #57
        +---[0.006808ms] xxxService.model.helios.HeliosGetScoreRequest:getFilterByAppId() #57
        +---[27.494178ms] xxxService.service.HeliosService:queryScoresTimeBetween() #57
        +---[0.02087ms] org.apache.commons.collections.CollectionUtils:isEmpty() #152
        +---[0.007694ms] java.util.HashSet:<init>() #156
        +---[19.990512ms] xxxService.helios.jobs.HeliosDataMergeJob:mergeData() #160
        +---[0.044161ms] java.util.List:stream() #162
        +---[0.025737ms] java.util.stream.Collectors:groupingBy() #57
        +---[min=0.079651ms,max=2.007048ms,total=2.086699ms,count=2] java.util.stream.Stream:collect() #57
        +---[0.018405ms] java.util.Map:values() #164
        +---[0.021408ms] java.util.Collection:iterator() #57
        +---[min=7.4E-4ms,max=0.015625ms,total=0.177657ms,count=121] java.util.Iterator:hasNext() #57
        +---[min=0.001193ms,max=0.026712ms,total=0.258491ms,count=120] java.util.Iterator:next() #57
        +---[min=7.69E-4ms,max=0.011855ms,total=0.158671ms,count=120] java.util.List:get() #165
        +---[min=0.001045ms,max=0.019788ms,total=0.232004ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:<init>() #166
        +---[min=0.001072ms,max=0.007958ms,total=0.193652ms,count=120] xxxService.helios.entity.HeliosScore:getNamespace() #167
        +---[min=0.001164ms,max=0.007796ms,total=0.201584ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:setNamespace() #57
        +---[min=0.001048ms,max=0.007456ms,total=0.178323ms,count=120] xxxService.helios.entity.HeliosScore:getAppId() #168
        +---[min=0.001137ms,max=0.010225ms,total=0.201887ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:setAppId() #57
        +---[min=0.001627ms,max=0.010431ms,total=0.291395ms,count=120] java.util.ArrayList:<init>() #169
        +---[min=0.00116ms,max=0.0088ms,total=0.20171ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:setScores() #57
        +---[min=0.001076ms,max=0.010293ms,total=0.199407ms,count=120] xxxService.model.helios.HeliosGetScoreResponse:getValues() #170
        +---[min=7.54E-4ms,max=0.086952ms,total=150.86682ms,count=170753] java.util.List:add() #57
        +---[min=0.020428ms,max=0.269554ms,total=19.477128ms,count=120] xxxService.helios.HeliosHelper:splitScores() #172
        +---[min=0.001092ms,max=0.005258ms,total=0.202045ms,count=120] xxxService.model.helios.HeliosGetScoreRequest:getStartTime() #175
        +---[min=7.09E-4ms,max=0.021027ms,total=0.630747ms,count=480] java.util.Date:getTime() #57
        +---[min=0.00106ms,max=0.015055ms,total=0.188439ms,count=120] xxxService.helios.entity.HeliosScore:getTimeFrom() #178
        +---[min=0.001025ms,max=0.009712ms,total=0.171506ms,count=120] xxxService.helios.entity.HeliosScore:getTimeFrom() #183
        +---[min=7.4E-4ms,max=0.092253ms,total=0.251068ms,count=120] java.util.Date:setTime() #57
        +---[min=0.001086ms,max=0.006234ms,total=0.184256ms,count=120] xxxService.model.helios.HeliosGetScoreRequest:getEndTime() #185
        +---[min=0.001036ms,max=0.012332ms,total=0.176491ms,count=120] xxxService.helios.entity.HeliosScore:getTimeTo() #186
      3 +---[min=6.73E-4ms,max=0.066785ms,total=135.009239ms,count=170635] java.util.List:size() #188
      1 +---[min=0.001085ms,max=0.089243ms,total=208.003309ms,count=170633] xxxService.model.helios.HeliosGetScoreResponse$Score:getScores() #189
      2 +---[min=7.31E-4ms,max=0.070823ms,total=145.488732ms,count=170633] java.util.List:get() #57
        +---[min=0.001177ms,max=0.143546ms,total=2.319379ms,count=1440] java.util.Date:<init>() #191
        +---[min=0.001346ms,max=0.064411ms,total=2.839878ms,count=1440] java.util.Set:add() #57
        +---[min=0.001096ms,max=0.009059ms,total=0.190336ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:getScores() #198
        +---[min=6.92E-4ms,max=0.016223ms,total=0.141751ms,count=120] java.util.List:size() #57
        +---[0.069753ms] java.util.ArrayList:<init>() #201
        +---[0.021066ms] java.util.ArrayList:stream() #57
        +---[0.029498ms] java.util.stream.Stream:sorted() #57
        +---[0.014089ms] java.util.stream.Stream:map() #57
        +---[0.013053ms] java.util.stream.Collectors:toList() #57
        `---[0.009818ms] xxxService.model.helios.HeliosGetScoreResponse:setDates() #57

分析

这一步将执行时间又优化了 80ms 左右。现在还剩是 160ms 了。

从 Trace 中看耗时时间最长的是三个方法:

  • getScores。直接 get 了属性啥也没干,但是积少成多
  • list.size()
  • list.get(index)

也就是说虽然这几个函数里也没干什么东西,但是函数调用、指针寻址本身也是有开销的。

第三次优化

优化方向

  1. 减少 list 属性的调用
  2. 一次次 list.add 方法改成 subList 一次性放入

也就是说循环中不做任何耗时操作,不做任何指针/引用。

代码

private HeliosGetScoreResponse queryScores(HeliosGetScoreRequest request) {
        HeliosGetScoreResponse response = new HeliosGetScoreResponse();

        List<HeliosScore> heliosScoresRecord = heliosService.queryScoresTimeBetween(request.getStartTime(), request.getEndTime(), request.getFilterByAppId());
        if (CollectionUtils.isEmpty(heliosScoresRecord)) {
            return response;
        }

        Set<Date> dateSet = new HashSet<>();
        boolean isDateSetInitial = false;
        int scoreSize = 16;

        List<HeliosScore> heliosScores = HeliosDataMergeJob.mergeData(heliosScoresRecord);

        Map<String, List<HeliosScore>> groupByAppIdHeliosScores = heliosScores.stream().collect(Collectors.groupingBy(HeliosScore::getAppId));

        for (List<HeliosScore> scores : groupByAppIdHeliosScores.values()) {
            HeliosScore heliosScore = scores.get(0);
            HeliosGetScoreResponse.Score score = new HeliosGetScoreResponse.Score();
            score.setNamespace(heliosScore.getNamespace());
            score.setAppId(heliosScore.getAppId());
            score.setScores(new ArrayList<>(scoreSize));
            response.getValues().add(score);

            List<Integer> scoreIntList = HeliosHelper.splitScores(heliosScore);

            // 以 requestTime 为准
            long indexDateMills = request.getStartTime().getTime();
            int index = 0;
            // 如果 timeFrom < requestTime,则增加 timeFrom 到 requestTime
            long heliosScoreTimeFromMills = heliosScore.getTimeFrom().getTime();
            while (indexDateMills > heliosScoreTimeFromMills) {
                heliosScoreTimeFromMills += 60_000;
                index++;
            }
            heliosScore.getTimeFrom().setTime(heliosScoreTimeFromMills);

            long requestEndTimeMills = request.getEndTime().getTime();
            long heliosScoreTimeToMills = heliosScore.getTimeTo().getTime();

            // 循环条件为 (当前时间 <= 请求最大时间) && (当前时间 <= 数据最大时间) && (index < 数据条数)
            int scoreIntListSize = scoreIntList.size();
            int indexStart = index;
            while (indexDateMills <= requestEndTimeMills && indexDateMills <= heliosScoreTimeToMills && index++ < scoreIntListSize) {
                if (!isDateSetInitial) {
                    dateSet.add(new Date(indexDateMills));
                }
                indexDateMills += 60_000;
            }
            score.getScores().addAll(scoreIntList.subList(indexStart, index - 1));
            // 性能优化,减少重复放入的次数
            isDateSetInitial = true;
            // 性能优化,初始化足够的 size 减少扩容次数。 x1.1 为了万一数据数量不一致,留出一点 buffer。
            scoreSize = (int) (score.getScores().size() * 1.1);
        }

        response.setDates(new ArrayList<>(dateSet).stream().sorted().map(DateUtils.yyyyMMddHHmm::formatDate).collect(Collectors.toList()));
        return response;
    }

Arthas Trace

`---ts=2021-08-17 15:33:40;thread_name=http-nio-8080-exec-11;id=f1;is_daemon=true;priority=5;TCCL=org.springframework.boot.web.embedded.tomcat.TomcatEmbeddedWebappClassLoader@d1c5cf2
    `---[138.624811ms] xxxService.controller.HeliosController:queryScores()
        +---[0.021852ms] xxxService.model.helios.HeliosGetScoreResponse:<init>() #149
        +---[0.00746ms] xxxService.model.helios.HeliosGetScoreRequest:getStartTime() #151
        +---[0.005838ms] xxxService.model.helios.HeliosGetScoreRequest:getEndTime() #57
        +---[0.006341ms] xxxService.model.helios.HeliosGetScoreRequest:getFilterByAppId() #57
    2   +---[15.227453ms] xxxService.service.HeliosService:queryScoresTimeBetween() #57
        +---[0.02168ms] org.apache.commons.collections.CollectionUtils:isEmpty() #152
        +---[0.008923ms] java.util.HashSet:<init>() #156
    1   +---[22.703926ms] xxxService.helios.jobs.HeliosDataMergeJob:mergeData() #160
        +---[0.047118ms] java.util.List:stream() #162
        +---[0.043183ms] java.util.stream.Collectors:groupingBy() #57
        +---[min=0.095654ms,max=2.183288ms,total=2.278942ms,count=2] java.util.stream.Stream:collect() #57
        +---[0.022906ms] java.util.Map:values() #164
        +---[0.025777ms] java.util.Collection:iterator() #57
        +---[min=9.28E-4ms,max=0.017187ms,total=0.261862ms,count=121] java.util.Iterator:hasNext() #57
        +---[min=9.88E-4ms,max=0.018901ms,total=0.280889ms,count=120] java.util.Iterator:next() #57
        +---[min=9.65E-4ms,max=0.014741ms,total=0.262695ms,count=120] java.util.List:get() #165
        +---[min=0.001215ms,max=0.013928ms,total=0.347762ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:<init>() #166
        +---[min=0.001253ms,max=0.010855ms,total=0.328842ms,count=120] xxxService.helios.entity.HeliosScore:getNamespace() #167
        +---[min=0.001316ms,max=0.014714ms,total=0.372553ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:setNamespace() #57
        +---[min=0.001211ms,max=0.010511ms,total=0.322723ms,count=120] xxxService.helios.entity.HeliosScore:getAppId() #168
        +---[min=0.00132ms,max=0.010201ms,total=0.334627ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:setAppId() #57
        +---[min=0.00116ms,max=0.014504ms,total=0.386879ms,count=120] java.util.ArrayList:<init>() #169
        +---[min=0.00131ms,max=0.014072ms,total=0.344922ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:setScores() #57
        +---[min=0.001261ms,max=0.017312ms,total=0.356444ms,count=120] xxxService.model.helios.HeliosGetScoreResponse:getValues() #170
        +---[min=9.73E-4ms,max=0.016531ms,total=0.275794ms,count=120] java.util.List:add() #57
     3  +---[min=0.023208ms,max=19.808819ms,total=47.196601ms,count=120] xxxService.helios.HeliosHelper:splitScores() #172
        +---[min=0.001289ms,max=0.009578ms,total=0.36878ms,count=120] xxxService.model.helios.HeliosGetScoreRequest:getStartTime() #175
        +---[min=8.85E-4ms,max=0.016405ms,total=0.994157ms,count=480] java.util.Date:getTime() #57
        +---[min=0.001238ms,max=0.016801ms,total=0.34399ms,count=120] xxxService.helios.entity.HeliosScore:getTimeFrom() #178
        +---[min=0.001217ms,max=0.008931ms,total=0.316197ms,count=120] xxxService.helios.entity.HeliosScore:getTimeFrom() #183
        +---[min=9.14E-4ms,max=0.015929ms,total=0.277078ms,count=120] java.util.Date:setTime() #57
        +---[min=0.001238ms,max=0.01061ms,total=0.3375ms,count=120] xxxService.model.helios.HeliosGetScoreRequest:getEndTime() #185
        +---[min=0.001225ms,max=0.018059ms,total=0.315198ms,count=120] xxxService.helios.entity.HeliosScore:getTimeTo() #186
        +---[min=8.79E-4ms,max=0.022669ms,total=0.272356ms,count=120] java.util.List:size() #189
        +---[min=0.002001ms,max=0.056977ms,total=4.32853ms,count=1440] java.util.Date:<init>() #193
        +---[min=0.002174ms,max=0.040594ms,total=4.594415ms,count=1440] java.util.Set:add() #57
        +---[min=0.001302ms,max=0.012925ms,total=0.353165ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:getScores() #197
        +---[min=0.001004ms,max=0.033424ms,total=0.338294ms,count=120] java.util.List:subList() #57
        +---[min=0.004871ms,max=0.051046ms,total=2.945263ms,count=120] java.util.List:addAll() #57
        +---[min=0.001291ms,max=0.009831ms,total=0.314292ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:getScores() #201
        +---[min=8.84E-4ms,max=0.018168ms,total=0.249321ms,count=120] java.util.List:size() #57
        +---[0.054305ms] java.util.ArrayList:<init>() #204
        +---[0.024481ms] java.util.ArrayList:stream() #57
        +---[0.028717ms] java.util.stream.Stream:sorted() #57
        +---[0.013725ms] java.util.stream.Stream:map() #57
        +---[0.0128ms] java.util.stream.Collectors:toList() #57
        `---[0.007166ms] xxxService.model.helios.HeliosGetScoreResponse:setDates() #57

分析

这一步又优化了 100ms 左右,现在还剩 60ms。

现在从 trace 上看耗时操作只有三个了:

  • 查数据库
  • 合并数据
  • 拆分得分字符串 “100,100,100” 为 int 数组 [100,100,100]

第四次优化

优化方向

  1. 查数据库发现由于 SQL 判断不准确,每次会多查出来一条数据,在后边循环的时候会多循环一倍
  2. 合并数据时发现可以针对单条数据的情况直接过滤,减少开销。

代码

  1. 改了 SQL 并验证,减少查询出来的数据量
  2. 单条数据时不再处理合并逻辑

Arthas Trace

`---ts=2021-08-17 16:03:24;thread_name=http-nio-8080-exec-13;id=f1;is_daemon=true;priority=5;TCCL=org.springframework.boot.web.embedded.tomcat.TomcatEmbeddedWebappClassLoader@69e2fe3b
    `---[38.171379ms] xxxService.controller.HeliosController:queryScores()
        +---[0.009463ms] xxxService.model.helios.HeliosGetScoreResponse:<init>() #149
        +---[0.00348ms] xxxService.model.helios.HeliosGetScoreRequest:getStartTime() #151
        +---[0.003233ms] xxxService.model.helios.HeliosGetScoreRequest:getEndTime() #57
        +---[0.003395ms] xxxService.model.helios.HeliosGetScoreRequest:getFilterByAppId() #57
     1  +---[10.157226ms] xxxService.service.HeliosService:queryScoresTimeBetween() #57
        +---[0.009989ms] org.apache.commons.collections.CollectionUtils:isEmpty() #152
        +---[0.003394ms] java.util.HashSet:<init>() #156
        +---[0.083535ms] xxxService.helios.jobs.HeliosDataMergeJob:mergeData() #160
        +---[0.017819ms] java.util.List:stream() #162
        +---[0.011787ms] java.util.stream.Collectors:groupingBy() #57
        +---[min=0.047561ms,max=2.02786ms,total=2.075421ms,count=2] java.util.stream.Stream:collect() #57
        +---[0.015525ms] java.util.Map:values() #164
        +---[0.021965ms] java.util.Collection:iterator() #57
        +---[min=7.25E-4ms,max=0.009733ms,total=0.115783ms,count=121] java.util.Iterator:hasNext() #57
        +---[min=8.43E-4ms,max=0.011422ms,total=0.142771ms,count=120] java.util.Iterator:next() #57
        +---[min=7.81E-4ms,max=0.010883ms,total=0.128809ms,count=120] java.util.List:get() #165
        +---[min=0.001023ms,max=0.004301ms,total=0.150165ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:<init>() #166
        +---[min=0.001066ms,max=0.004648ms,total=0.154698ms,count=120] xxxService.helios.entity.HeliosScore:getNamespace() #167
        +---[min=0.001137ms,max=0.005607ms,total=0.170279ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:setNamespace() #57
        +---[min=0.001023ms,max=0.004292ms,total=0.151767ms,count=120] xxxService.helios.entity.HeliosScore:getAppId() #168
        +---[min=0.001105ms,max=0.004701ms,total=0.164955ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:setAppId() #57
        +---[min=0.001359ms,max=0.007931ms,total=0.233665ms,count=120] java.util.ArrayList:<init>() #169
        +---[min=0.001117ms,max=0.00785ms,total=0.168539ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:setScores() #57
        +---[min=0.001073ms,max=0.004488ms,total=0.156654ms,count=120] xxxService.model.helios.HeliosGetScoreResponse:getValues() #170
        +---[min=7.98E-4ms,max=0.00977ms,total=0.125818ms,count=120] java.util.List:add() #57
        +---[min=0.022304ms,max=0.12093ms,total=8.88628ms,count=120] xxxService.helios.HeliosHelper:splitScores() #172
        +---[min=0.001092ms,max=0.004967ms,total=0.161288ms,count=120] xxxService.model.helios.HeliosGetScoreRequest:getStartTime() #175
        +---[min=7.02E-4ms,max=0.012136ms,total=0.467786ms,count=480] java.util.Date:getTime() #57
        +---[min=0.001022ms,max=0.004944ms,total=0.151353ms,count=120] xxxService.helios.entity.HeliosScore:getTimeFrom() #178
        +---[min=0.001018ms,max=0.004731ms,total=0.148025ms,count=120] xxxService.helios.entity.HeliosScore:getTimeFrom() #183
        +---[min=7.3E-4ms,max=0.009359ms,total=0.120588ms,count=120] java.util.Date:setTime() #57
        +---[min=0.00107ms,max=0.008948ms,total=0.162848ms,count=120] xxxService.model.helios.HeliosGetScoreRequest:getEndTime() #185
        +---[min=0.001034ms,max=0.014003ms,total=0.158614ms,count=120] xxxService.helios.entity.HeliosScore:getTimeTo() #186
        +---[min=6.99E-4ms,max=0.009995ms,total=0.11179ms,count=120] java.util.List:size() #189
        +---[min=6.95E-4ms,max=0.005468ms,total=1.116308ms,count=1440] java.util.Date:<init>() #193
        +---[min=7.79E-4ms,max=0.029909ms,total=1.407528ms,count=1440] java.util.Set:add() #57
        +---[min=0.001097ms,max=0.008616ms,total=0.160597ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:getScores() #197
        +---[min=8.23E-4ms,max=0.0294ms,total=0.153353ms,count=120] java.util.List:subList() #57
        +---[min=0.005771ms,max=0.04465ms,total=1.992151ms,count=120] java.util.List:addAll() #57
        +---[min=0.001098ms,max=0.007013ms,total=0.169555ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:getScores() #201
        +---[min=7.04E-4ms,max=0.01315ms,total=0.120998ms,count=120] java.util.List:size() #57
        +---[0.197732ms] java.util.ArrayList:<init>() #204
        +---[0.018589ms] java.util.ArrayList:stream() #57
        +---[0.025192ms] java.util.stream.Stream:sorted() #57
        +---[0.012544ms] java.util.stream.Stream:map() #57
        +---[0.012188ms] java.util.stream.Collectors:toList() #57
        `---[0.0067ms] xxxService.model.helios.HeliosGetScoreResponse:setDates() #57

分析

可以看到现在最大耗时的地方终于是数据库查询了。现在查询一整天的数据,也只需要 25~40ms 左右。

结果

链路:

img

链路上看程序代码还是要处理个十几 ms,主要是字符串转 int[] 时的开销,这一步可以再想办法继续优化。

结论

从这次优化我们可以得到一些结论:

  1. 尽量少创建对象
  2. SimpleDateFormat的开销很大
  3. Date.compare 的开销不低
  4. 哪怕最简单的操作如 list.size() list.add次数多了开销也很可观
  5. 对于性能分析和优化一定要有合适工具,才能得出有用的结论并针对性优化。一开始我以为减少对象创建就万事大吉,但实际上性能消耗的大头并不在这里。还是得借助 Arthas 的 Trace 才能真正针对性地优化。

另外强烈推荐 动态追踪技术漫谈

原文地址:https://github.com/alibaba/arthas/issues/1892