Commit a1868c9

Tyler Burdsall authored
Merge pull request #4 from iamtheburd/test_performance
Reinstate performance mode
2 parents 198a6e7 + 7f185cb commit a1868c9

File tree

7 files changed

+146
-10
lines changed

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -2,3 +2,4 @@ build/*.o
22
combigen
33
combigen.exe
44
combigen.obj
5+
*.txt

README.md

Lines changed: 53 additions & 2 deletions
@@ -15,7 +15,7 @@ Basic commands are listed below:
1515
Usage: combigen [options]
1616
-h Displays this help message
1717
18-
-a Generates every possible combination
18+
-a Generates every possible combination, restricted to memory mode.
1919
(Note: this should be used with caution when storing to disk)
2020
2121
-n <index> Generate combination at nth index
@@ -33,6 +33,11 @@ Usage: combigen [options]
3333
3434
-k Display the keys on the first line of output (for .csv)
3535
36+
-p Use performance mode to generate combinations faster at the
37+
expense of higher RAM usage.
38+
(Note: this is only recommended for computers with large amounts
39+
of RAM when generating a large number of random combinations)
40+
3641
-v Display version number
3742
```
3843

@@ -102,6 +107,7 @@ Or you can feed in an input from `stdin`:
102107
$ cat example_data/combinations.json | combigen -n 100 # Find the combination at index 100
103108
```
104109

110+
Alternatively, if you want to manually type in your string, the program will await user input until EOF. For Windows, this is `CTRL+Z`. For Linux/UNIX, this is `CTRL+D`.
105111

106112
### Output
107113

@@ -207,6 +213,51 @@ $ combigen -i example_data/combinations.json -r 5 -t json # Generate 5 random c
207213
$
208214
```
209215

216+
## Using Performance Mode
217+
218+
When generating a large number of combinations, you may want to speed up the process. In that case, pass the `-p` flag to switch combigen to Performance Mode, which generates all of the combinations at once before writing them to `stdout`. **Note: this is only recommended for systems with a large amount of RAM when generating very large sets of data**.
219+
220+
Compared with the default Memory Mode, the difference only becomes noticeable once the generated data sets are quite large. See the results of some tests below for more information.
221+
222+
For now, generating every possible combination (`-a`) always runs in Memory Mode to save RAM.
223+
224+
### Performance Tests
225+
226+
To visualize the performance differences between Memory Mode and Performance Mode, a small test was performed to illustrate where Performance Mode begins to offer a significant advantage.
227+
228+
#### Testing Parameters
229+
230+
Each test timed how long it took to generate *n* random combinations and write them to disk, repeated 5 times. For each value of *n*, the average of these 5 runs was recorded and graphed.
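
The averaging step can be illustrated with a small in-process sketch. The tests themselves used the shell's `time` on the whole command, so this is only an analogue: `average` and `average_seconds` are hypothetical helpers, and `task` stands in for whatever work is being measured.

```cpp
#include <chrono>
#include <functional>
#include <numeric>
#include <vector>

// Arithmetic mean of the samples; returns 0 for an empty set.
double average(const std::vector<double> &samples)
{
    if (samples.empty())
    {
        return 0.0;
    }
    return std::accumulate(samples.begin(), samples.end(), 0.0) / samples.size();
}

// Run `task` `runs` times (5 by default, as in the tests described above)
// and return the mean wall-clock duration in seconds.
double average_seconds(const std::function<void()> &task, int runs = 5)
{
    std::vector<double> samples;
    for (int i = 0; i < runs; ++i)
    {
        const auto start = std::chrono::steady_clock::now();
        task();
        const std::chrono::duration<double> elapsed =
            std::chrono::steady_clock::now() - start;
        samples.push_back(elapsed.count());
    }
    return average(samples);
}
```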
231+
232+
The following tests were performed on a Lenovo ThinkPad T460 with the following specs:
233+
234+
* Windows 7 Enterprise
235+
* 256GB SSD w/full disk encryption
236+
* 8GB RAM
237+
* Intel Core i5-6300U @ 2.40GHz
238+
239+
The environment was tested with the following:
240+
241+
* Compiled with Visual Studio Developer Tools 2017 with the compile flags listed above
242+
* Git Bash as a shell to utilize the UNIX `time` function
243+
* Each iteration was generated using the command `time ./combigen.exe -i example_data/combinations.json -r "$n" > output.txt`, where `$n` is the number of random combinations
244+
245+
The source code for these shell scripts can be found in the [performance_tests](performance_tests/) folder.
246+
247+
#### Testing Results
248+
249+
The results from the test were graphed:
250+
251+
![Testing Results](performance_tests/performance-mode-vs-memory-mode-test-results.png)
252+
253+
#### Conclusion
254+
255+
Based on the results above, Performance Mode only starts to offer real benefits when the number of combinations is quite large. However, it should only be used when the computer can actually hold all of these combinations in RAM. Ultimately, it boils down to two cases:
256+
257+
* If you can spare the time and don't want to bog down your machine (or the number of generated combinations is small), stick with the default Memory Mode.
258+
* If you have a well-spec'd machine and can sacrifice the RAM when generating a large number of combinations, choose Performance Mode.
259+
260+
Regardless, a large number of combinations requires a large amount of disk space, so take this into account when generating data.
210261

211262
## Third-Party Libraries
212263

@@ -223,4 +274,4 @@ Combigen uses the following open-source libraries:
223274
Pull-requests are always welcome
224275

225276
## License
226-
Licensed under GPLv3, see [LICENSE](https://github.com/iamtheburd/blob/master/LICENSE)
277+
Licensed under GPLv3, see [LICENSE](https://github.com/iamtheburd/combigen/blob/master/LICENSE)

performance_tests/test_memory_mode

Lines changed: 13 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,13 @@
1+
#!/bin/bash
2+
3+
for amount in 100000 500000 1000000 1500000 2000000 2500000 3000000
4+
do
5+
for i in {1..5}
6+
do
7+
echo "$amount - $i";
8+
time ./combigen.exe -i example_data/combinations.json -r "$amount" -k > output.txt
9+
echo
10+
done
11+
done
12+
rm output.txt
13+
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
1+
#!/bin/bash
2+
3+
for amount in 100000 500000 1000000 1500000 2000000 2500000 3000000
4+
do
5+
for i in {1..5}
6+
do
7+
echo "$amount - $i";
8+
time ./combigen.exe -i example_data/combinations.json -r "$amount" -k -p > output.txt
9+
echo
10+
done
11+
done
12+
rm output.txt
13+

src/combigen.cpp

Lines changed: 63 additions & 7 deletions
@@ -2,30 +2,35 @@
22

33
int main(int argc, char* argv[])
44
{
5-
int c;
5+
int c;
6+
bool args_provided = false;
67
generation_args args;
78

8-
while ( (c = getopt(argc, argv, "han:i:t:r:d:kv")) != -1)
9+
10+
while ( (c = getopt(argc, argv, "han:i:t:r:d:kvp")) != -1)
911
{
1012
switch (c)
1113
{
1214
case 'h':
1315
display_help();
1416
exit(0);
1517
case 'a':
18+
args_provided = true;
1619
args.generate_all_combinations = true;
1720
break;
1821
case 'n':
1922
if (optarg)
2023
{
2124
istringstream iss (optarg);
2225
iss >> args.entry_at;
26+
args_provided = true;
2327
}
2428
break;
2529
case 'i':
2630
if (optarg)
2731
{
2832
args.input = optarg;
33+
args_provided = true;
2934
}
3035
break;
3136
case 't':
@@ -48,6 +53,7 @@ int main(int argc, char* argv[])
4853
{
4954
istringstream iss (optarg);
5055
iss >> args.sample_size;
56+
args_provided = true;
5157
}
5258
break;
5359
case 'd':
@@ -57,16 +63,24 @@ int main(int argc, char* argv[])
5763
}
5864
break;
5965
case 'k':
60-
args.display_keys = true;
66+
args.display_keys = true; // keep setting this so -k still enables the key header
args_provided = true;
6167
break;
6268
case 'v':
6369
cout << "combigen - v" << COMBIGEN_MAJOR_VERSION << '.' << COMBIGEN_MINOR_VERSION << '.' << COMBIGEN_REVISION_VERSION << '\n';
6470
exit(0);
71+
case 'p':
72+
args.perf_mode = true;
73+
break;
6574
default:
6675
display_help();
6776
exit(-1);
6877
}
6978
}
79+
if (!args_provided)
80+
{
81+
display_help();
82+
exit(0);
83+
}
7084
if (args.input.empty())
7185
{
7286
istreambuf_iterator<char> begin(cin), end;
@@ -78,6 +92,7 @@ int main(int argc, char* argv[])
7892
{
7993
args.pc = parse_file(args.input);
8094
}
95+
8196
try
8297
{
8398
parse_args(args);
@@ -125,7 +140,7 @@ static const void display_help(void)
125140
{
126141
cout << "Usage: combigen [options]" << "\n"
127142
<< " -h Displays this help message" << "\n\n"
128-
<< " -a Generates every possible combination" << "\n"
143+
<< " -a Generates every possible combination, restricted to memory mode." << "\n"
129144
<< " (Note: this should be used with caution when storing to disk)" << "\n\n"
130145
<< " -n <index> Generate combination at nth index" << "\n\n"
131146
<< " -i <input> Take the given .json file as input. Otherwise, input will come" << "\n"
@@ -136,6 +151,10 @@ static const void display_help(void)
136151
<< " the possible set of combinations" << "\n\n"
137152
<< " -d <delimiter> Set the delimiter when displaying combinations (default is ',')" << "\n\n"
138153
<< " -k Display the keys on the first line of output (for .csv)" << "\n\n"
154+
<< " -p Use performance mode to generate combinations faster at the" << "\n"
155+
<< " expense of higher RAM usage." << "\n"
156+
<< " (Note: this is only recommended for computers with large amounts" << "\n"
157+
<< " of RAM when generating a large number of random combinations)" << "\n\n"
139158
<< " -v Display version number" << "\n";
140159
}
141160

@@ -196,6 +215,35 @@ static const void generate_random_samples(const vector<long> &range, const gener
196215
}
197216
}
198217

218+
static const void generate_random_samples_performance_mode( const generation_args &args)
219+
{
220+
const vector<vector<string>> results = lazy_cartesian_product::generate_samples(args.pc.combinations, args.sample_size);
221+
222+
if (!args.display_json)
223+
{
224+
if (args.display_keys)
225+
{
226+
display_csv_keys(args.pc.keys, args.delim);
227+
}
228+
}
229+
else
230+
{
231+
cout << "[\n";
232+
}
233+
for( const vector<string> &row: results)
234+
{
235+
output_result(row, args, true);
236+
if (args.display_json && &row != &results.back())
237+
{
238+
cout << ",";
239+
}
240+
}
241+
if (args.display_json)
242+
{
243+
cout << "]\n";
244+
}
245+
}
246+
199247
static const void output_result(const vector<string> &result, const generation_args &args, const bool &for_optimization)
200248
{
201249
if (!args.display_json)
@@ -261,8 +309,15 @@ static const void parse_args(const generation_args &args)
261309
cerr << "ERROR: Sample size cannot be greater than maximum possible combinations\n";
262310
exit(-1);
263311
}
264-
vector<long> range = lazy_cartesian_product::generate_random_indices(n, max_size);
265-
generate_random_samples(range, args);
312+
if (args.perf_mode)
313+
{
314+
generate_random_samples_performance_mode(args);
315+
}
316+
else
317+
{
318+
vector<long> range = lazy_cartesian_product::generate_random_indices(n, max_size);
319+
generate_random_samples(range, args);
320+
}
266321
exit(0);
267322
}
268323
else
@@ -326,4 +381,5 @@ static const possible_combinations parse_stdin(const string &input)
326381
exit(-1);
327382
}
328383
return pc;
329-
}
384+
}
385+

src/combigen.h

Lines changed: 3 additions & 1 deletion
@@ -7,7 +7,7 @@
77

88
#define COMBIGEN_MAJOR_VERSION 1
99
#define COMBIGEN_MINOR_VERSION 2
10-
#define COMBIGEN_REVISION_VERSION 0
10+
#define COMBIGEN_REVISION_VERSION 2
1111

1212
#include <iostream>
1313
#include <iterator>
@@ -58,12 +58,14 @@ struct generation_args
5858
bool generate_all_combinations = false;
5959
bool display_keys = false;
6060
bool display_json = false;
61+
bool perf_mode = false;
6162
};
6263

6364
static const void display_csv_keys(const vector<string> &keys, const string &delim);
6465
static const void display_help(void);
6566
static const void generate_all(const long &max_size, const generation_args &args);
6667
static const void generate_random_samples(const vector<long> &range, const generation_args &args);
68+
static const void generate_random_samples_performance_mode(const generation_args &args);
6769
static const void output_result(const vector<string> &result, const generation_args &args, const bool &for_optimization);
6870
static const void parse_args(const generation_args &args);
6971
static const possible_combinations parse_file(const string &input);
