Post on 12-Apr-2017
Abstracting Execution Logs to Execution Events for Enterprise Applications

Zhen Ming Jiang, Ahmed E. Hassan
Software Analysis and Intelligence (SAIL) Lab, Queen's University, Canada

Gilbert Hamann, Parminder Flora
Enterprise Performance Engineering, Research In Motion (RIM), Canada
How many types of errors are there?

One RIM application generates 1.6 million log lines in 8 hours, and 23,000 of those lines contain "fail" or "failure".

In total there are 319 execution events, 16 of which contain "fail" or "failure".

Events                                          Frequency
Error occurred during purchasing, item=$v       500
Error! Cannot retrieve catalogs for user=$v     300
Authentication error for user=$v                100
Abstracting Log Lines to Execution Events

1. User checkout for accountID(Tom), item=100
2. User checkout for accountID(Jenny), item=100
3. Item shipped for accountID(Tom), item=100
4. User checkout for accountID(John), item=100

Events                                        Lines
User checkout for accountID($v), item=$v      1, 2, 4
Item shipped for accountID($v), item=$v       3
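
The mapping from log lines to execution events shown above can be sketched in a few lines of Python. This is only an illustration: the two regular expressions are hypothetical rules inferred from the four example lines (a value in parentheses, a value after "="), not the rule set used in the approach, and the names `abstract_line` and `group_lines_by_event` are made up for this sketch.

```python
import re
from collections import defaultdict

# Hypothetical rules inferred from the example lines on this slide.
PAREN_VALUE = re.compile(r"\(([^()]*)\)")   # accountID(Tom) -> accountID($v)
EQUALS_VALUE = re.compile(r"=\s*[^\s,]+")   # item=100       -> item=$v


def abstract_line(line):
    """Replace the dynamic parts of one log line with the placeholder $v."""
    line = PAREN_VALUE.sub("($v)", line)
    line = EQUALS_VALUE.sub("=$v", line)
    return line


def group_lines_by_event(lines):
    """Map each abstracted event to the 1-based numbers of its log lines."""
    events = defaultdict(list)
    for number, line in enumerate(lines, start=1):
        events[abstract_line(line)].append(number)
    return dict(events)


log_lines = [
    "User checkout for accountID(Tom), item=100",
    "User checkout for accountID(Jenny), item=100",
    "Item shipped for accountID(Tom), item=100",
    "User checkout for accountID(John), item=100",
]
events = group_lines_by_event(log_lines)
for event, numbers in events.items():
    print(event, numbers)
```

Counting the length of each list gives the per-event frequency used in the error table earlier in the deck.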
Clone Detection Approach: Parameterized Token Matching
Running CCFinder on Logs

Won't work for large files
Unsatisfying results
Because log lines do not have
    Delimiters like ; or }
    Keywords like if, for
Working Example

Start check out
Paid for, item=bag, quantity=1, amount=100
Paid for, item=book, quantity=3, amount=150
Check out, total amount is 250
Check out done
Our Log Abstraction Approach
3_0_1    1. Start check out
3_0_2    5. Check out done
5_1_1    4. Check out, total amount=$v
8_3_1    2. Paid for, item=$v, quantity=$v, amount=$v
8_3_1    3. Paid for, item=$v, quantity=$v, amount=$v
[Figure: overview of the approach. Steps: Anonymize, Tokenize, Categorize, Reconcile. Data: Execution Logs, Anonymized Execution Logs, Bins, Execution Events DB, Abstracted Log Lines.]
Anonymize

Start check out
Paid for, item=bag, quantity=1, amount=100
Paid for, item=book, quantity=3, amount=150
Check out, total amount is 250
Check out done

Start check out
Paid for, item=$v, quantity=$v, amount=$v
Paid for, item=$v, quantity=$v, amount=$v
Check out, total amount=$v
Check out done
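
A minimal sketch of the Anonymize step, assuming just two heuristics that can be read off the working example: a value after "=" becomes "$v", and a phrase of the form "is value" becomes "=$v". The rule patterns and the function name `anonymize` are assumptions for illustration; the slides do not list the actual rules.

```python
import re

# Assumed heuristics, inferred only from the working example:
#   "name=value"    -> "name=$v"
#   "name is value" -> "name=$v"
EQUALS_RULE = re.compile(r"=\s*[^\s,]+")
IS_RULE = re.compile(r"\s+is\s+[^\s,]+")


def anonymize(line):
    """Replace dynamic values in one log line with the placeholder $v."""
    line = EQUALS_RULE.sub("=$v", line)
    line = IS_RULE.sub("=$v", line)
    return line


execution_log = [
    "Start check out",
    "Paid for, item=bag, quantity=1, amount=100",
    "Paid for, item=book, quantity=3, amount=150",
    "Check out, total amount is 250",
    "Check out done",
]
anonymized = [anonymize(line) for line in execution_log]
for line in anonymized:
    print(line)
```

Heuristics like these are cheap to apply line by line, but values that follow neither pattern (for example a bare user name) pass through unchanged, which is what the later Reconcile step is for.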
Tokenize

Start check out
Paid for, item=$v, quantity=$v, amount=$v
Paid for, item=$v, quantity=$v, amount=$v
Check out, total amount=$v
Check out done

(3, 0)
    1. Start check out
    5. Check out done
(5, 1)
    4. Check out, total amount=$v
(8, 3)
    2. Paid for, item=$v, quantity=$v, amount=$v
    3. Paid for, item=$v, quantity=$v, amount=$v
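
The binning above can be reproduced with a short sketch. It assumes that a bin is keyed by (number of words, number of "$v" parameters) and that words are separated by whitespace, commas, and "="; that tokenization rule is an assumption chosen because it yields the bin labels shown on the slide, and the names `bin_key` and `tokenize` are hypothetical.

```python
import re
from collections import defaultdict

# Assumed tokenization: split on whitespace, commas and "=".
TOKEN = re.compile(r"[^\s,=]+")


def bin_key(line):
    """Return (word count, parameter count) for an anonymized log line."""
    tokens = TOKEN.findall(line)
    return len(tokens), tokens.count("$v")


def tokenize(lines):
    """Put each (line number, line) pair into the bin given by bin_key."""
    bins = defaultdict(list)
    for number, line in enumerate(lines, start=1):
        bins[bin_key(line)].append((number, line))
    return dict(bins)


anonymized_log = [
    "Start check out",
    "Paid for, item=$v, quantity=$v, amount=$v",
    "Paid for, item=$v, quantity=$v, amount=$v",
    "Check out, total amount=$v",
    "Check out done",
]
bins = tokenize(anonymized_log)
for key in sorted(bins):
    print(key, [number for number, _ in bins[key]])
```

Binning first means that later steps only compare lines within one bin, which keeps the number of comparisons small on large logs.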
Categorize

(3, 0)
    1. Start check out
    5. Check out done
(5, 1)
    4. Check out, total amount=$v
(8, 3)
    2. Paid for, item=$v, quantity=$v, amount=$v
    3. Paid for, item=$v, quantity=$v, amount=$v

3_0_1    1. Start check out
3_0_2    5. Check out done
5_1_1    4. Check out, total amount=$v
8_3_1    2. Paid for, item=$v, quantity=$v, amount=$v
8_3_1    3. Paid for, item=$v, quantity=$v, amount=$v
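
A sketch of the Categorize step under one assumption: within a bin, lines with identical anonymized text share an execution event, and the event ID is formed as words_parameters_sequence, as the IDs on the slide suggest. The function name `categorize` and the tuple layout are illustrative choices, not taken from the slides.

```python
def categorize(bins):
    """Assign an event ID of the form words_params_n to every line in every bin.

    bins maps (word count, parameter count) to a list of (line number, text).
    Returns a list of (event ID, line number, text) tuples.
    """
    abstracted = []
    for (words, params), entries in sorted(bins.items()):
        event_numbers = {}  # anonymized text -> sequence number within this bin
        for number, text in entries:
            if text not in event_numbers:
                event_numbers[text] = len(event_numbers) + 1
            event_id = f"{words}_{params}_{event_numbers[text]}"
            abstracted.append((event_id, number, text))
    return abstracted


bins = {
    (3, 0): [(1, "Start check out"), (5, "Check out done")],
    (5, 1): [(4, "Check out, total amount=$v")],
    (8, 3): [
        (2, "Paid for, item=$v, quantity=$v, amount=$v"),
        (3, "Paid for, item=$v, quantity=$v, amount=$v"),
    ],
}
result = categorize(bins)
for event_id, number, text in result:
    print(event_id, number, text)
```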
Reconcile

5_0_1    Start processing for user Jen
5_0_2    Start processing for user Tom
5_0_3    Start processing for user Henry
5_0_4    Start processing for user Jack
5_0_5    Start processing for user Peter

5_0_1    Start processing for user $v
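
One way to sketch this kind of reconciliation: within a bin, look for groups of events that are identical except at a single token position, and merge a group once it is large enough. The threshold `min_events=5` is an assumption suggested only by the five events in the example, and the function name `reconcile` and its return format are hypothetical.

```python
from collections import defaultdict


def reconcile(events, min_events=5):
    """Merge events from one bin that differ at exactly one token position.

    events maps event ID -> text; all texts are assumed to have the same
    number of whitespace-separated tokens. min_events is an assumed threshold.
    Returns a dict mapping each original ID to (reconciled ID, template).
    """
    tokens = {eid: text.split() for eid, text in events.items()}
    result = {eid: (eid, text) for eid, text in events.items()}
    merged = set()
    width = min((len(t) for t in tokens.values()), default=0)
    for position in range(width):
        groups = defaultdict(list)
        for eid, toks in tokens.items():
            if eid not in merged:
                # Mask one position and group events that agree everywhere else.
                key = tuple(toks[:position] + ["$v"] + toks[position + 1:])
                groups[key].append(eid)
        for key, ids in groups.items():
            if len(ids) >= min_events:
                template = " ".join(key)
                for eid in ids:
                    result[eid] = (ids[0], template)
                merged.update(ids)
    return result


events = {
    "5_0_1": "Start processing for user Jen",
    "5_0_2": "Start processing for user Tom",
    "5_0_3": "Start processing for user Henry",
    "5_0_4": "Start processing for user Jack",
    "5_0_5": "Start processing for user Peter",
}
reconciled = reconcile(events)
print(reconciled["5_0_3"])
```

A threshold guards against merging two genuinely different events that happen to differ by one word; where to set it is a tuning decision.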
Reconcile

(6, 2)    User shopping basket contains: 1, 2
(7, 3)    User shopping basket contains: 1, 2, 3
(8, 4)    User shopping basket contains: 1, 2, 3, 4

6_2_1    User shopping basket contains: $v
7_3_1    User shopping basket contains: $v
8_4_1    User shopping basket contains: $v

6_2_1    User shopping basket contains: $v
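
The basket example involves events that land in different bins only because a list has a different number of items. A hedged sketch of how such events could be reconciled: collapse any comma-separated run of "$v" into a single "$v" and merge events whose collapsed text is identical. The input format (one "$v" per list item) and the names `collapse` and `reconcile_lists` are assumptions for illustration, not details given on the slide.

```python
import re

# A comma-separated run of two or more parameters, e.g. "$v, $v, $v".
REPEATED_PARAMS = re.compile(r"\$v(?:\s*,\s*\$v)+")


def collapse(text):
    """Collapse a run of comma-separated $v parameters into a single $v."""
    return REPEATED_PARAMS.sub("$v", text)


def reconcile_lists(events):
    """Map each event ID to (first ID with the same collapsed text, template)."""
    first_id = {}
    mapping = {}
    for eid, text in events.items():
        template = collapse(text)
        first_id.setdefault(template, eid)
        mapping[eid] = (first_id[template], template)
    return mapping


# Assumed form of the three events before reconciliation.
basket_events = {
    "6_2_1": "User shopping basket contains: $v, $v",
    "7_3_1": "User shopping basket contains: $v, $v, $v",
    "8_4_1": "User shopping basket contains: $v, $v, $v, $v",
}
basket_mapping = reconcile_lists(basket_events)
print(basket_mapping["8_4_1"])
```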
Measuring the Performance
Measuring the Performance: Getting the Correct Execution Events

Simply searching for printf or System.out won't work
We use
    Internationalization file
    Random sampling
Case Study

4 Applications

Application      Log lines
RIM App 1        723,608
RIM App 2        1,688,876
LoadSim          67,651
Blue Gene/L      2,994,986

Other similar log abstraction tools
    Terrify
    SLCT
SLCT

Uses Frequent Itemset Mining
Performance Comparison
Discussion: SLCT Performance

SLCT performance is not high, because
    Infrequent log lines won't be abstracted
    It does not further abstract line patterns
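
To illustrate the first point, here is a heavily simplified sketch in the spirit of frequent-word clustering: count (position, word) pairs, keep the frequent ones, and turn each line into a pattern with "*" at the infrequent positions. It is not SLCT's implementation; the function name `frequent_word_clusters` and the `support` threshold of 3 are assumptions for this example, which reuses log lines from earlier slides.

```python
from collections import Counter


def frequent_word_clusters(lines, support):
    """Return {pattern: line count} for patterns seen at least `support` times."""
    token_lists = [line.split() for line in lines]

    # Pass 1: find (position, word) pairs that occur in at least `support` lines.
    pair_counts = Counter(
        (pos, word) for tokens in token_lists for pos, word in enumerate(tokens)
    )
    frequent = {pair for pair, count in pair_counts.items() if count >= support}

    # Pass 2: build a candidate pattern per line from its frequent words.
    candidates = Counter()
    for tokens in token_lists:
        pattern = tuple(
            word if (pos, word) in frequent else "*"
            for pos, word in enumerate(tokens)
        )
        if any(token != "*" for token in pattern):
            candidates[pattern] += 1

    return {" ".join(p): c for p, c in candidates.items() if c >= support}


sample_lines = [
    "Start processing for user Jen",
    "Start processing for user Tom",
    "Start processing for user Henry",
    "Start processing for user Jack",
    "Start processing for user Peter",
    "Check out done",
]
clusters = frequent_word_clusters(sample_lines, support=3)
print(clusters)
```

The single "Check out done" line has no frequent words, so it ends up in no cluster at all, which is the behavior the first bullet describes.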
Discussion: Our Heuristics

Adjusting our heuristics
    Anonymization rules
    Reconcile step
Conclusions
How many types of errors are there?
Our Log Abstraction Approach
Measuring the Performance
Performance Comparison