http://paiza.jp/poh/ec-campaign/result/3c7e1ae7a3b70790db8da0ef6c287c11
For the background and the problem details, see https://paiza.jp/poh/ec-campaign. #paizahack_01
The naive way to find the optimal solution is to try every combination of two products, which is $O(DN^2)$ (apparently this is the model answer prepared by paiza, and it only passes case 1). To avoid that and get $O(DN \log N)$, I used a binary search: read the input with `scanf()` as usual, sort with `qsort()` as usual, then for each product binary-search for its best partner.

However, this code produced the embarrassing result that case 3 times out. In a hurry, I created a benchmark input equivalent to case 3 and started tuning so that I could measure performance locally.
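For reference, here is a minimal sketch of what that first search did, reconstructed from the description above rather than taken from the submitted code. It assumes `prices[]` is already sorted ascending, and the helper name `bestpair_binsearch` is mine:

```c
// Reconstruction of the O(DN log N) search: for each product, binary-search
// the highest-priced partner that keeps the pair within the limit.
// Assumes prices[] is sorted ascending.
int
bestpair_binsearch(const int *prices, int n, int limit) {
    int best = 0;
    int i;
    for (i = 0; i < n; i++) {
        int want = limit - prices[i];
        int lo = 0, hi = n - 1, j = -1;
        while (lo <= hi) {              // largest index with prices[j] <= want
            int mid = lo + (hi - lo) / 2;
            if (prices[mid] <= want) {
                j = mid;
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        if (j == i)
            j--;                        // don't pair a product with itself
        if (j >= 0 && prices[i] + prices[j] > best)
            best = prices[i] + prices[j];
    }
    return best;                        // 0 if no valid pair exists
}
```

Each query costs $O(N \log N)$ on top of the initial sort, hence $O(DN \log N)$ for all $D$ days.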
A light profiling pass showed that the search was what took too long. With $D$ as large as 300, each search has to finish in under 0.01 seconds or there is no hope. So I drastically changed the search algorithm.
Since we want to maximize the sum of two product prices without exceeding the campaign limit, it suffices to advance one pointer from each end of the price range. If the current total exceeds the limit, move the higher pointer to a cheaper product. If it is at or below the limit, update the provisional solution whenever it improves, then move the lower pointer to a more expensive product. If the total matches the limit exactly, that is the final answer; if nothing matches exactly by the end, the provisional solution at that point is the answer.

As starting positions, the lower pointer should sit on the cheapest product, while the higher pointer should sit at the campaign limit minus the cheapest price. That position can be found with the binary search left over from the original algorithm, which saves a little work.
With this policy, the search part is only $O(DN)$. Indeed, the benchmark search time dropped dramatically: all 300 queries now run in under 0.01 seconds. The program as a whole, however, still took a little under 0.30 seconds.
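The same idea on a plain sorted array looks roughly like this (a sketch; the final version below runs this scan over price buckets instead, and the name `bestpair_twopointer` is mine):

```c
// Two-pointer scan over a sorted prices[] array: O(N) per query.
int
bestpair_twopointer(const int *prices, int n, int limit) {
    int l = 0, r = n - 1;
    int best = 0;
    while (l < r) {
        int total = prices[l] + prices[r];
        if (total > limit) {
            r--;                // too expensive: try a cheaper partner
        } else {
            if (total > best)
                best = total;   // improved provisional solution
            if (total == limit)
                break;          // exact match is the final answer
            l++;                // still room: try a pricier cheap item
        }
    }
    return best;
}
```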
The next target was the sort. `qsort()` carries overhead because it calls the comparison function through a pointer many times. To eliminate that, I wrote my own non-recursive merge sort. The total came down to just over 0.20 seconds.
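Roughly what such a bottom-up merge sort looks like (a sketch, not the exact code I wrote; it needs a scratch array of the same size):

```c
#include <string.h>

// Non-recursive (bottom-up) merge sort for ints: no recursion and no
// comparison-function calls. tmp must have room for n ints.
void
mergesort_bottomup(int *a, int *tmp, int n) {
    int width, lo;
    for (width = 1; width < n; width *= 2) {
        for (lo = 0; lo < n; lo += 2*width) {
            int mid = (lo + width < n) ? lo + width : n;
            int hi  = (lo + 2*width < n) ? lo + 2*width : n;
            int i = lo, j = mid, k = lo;
            while (i < mid && j < hi)   // merge a[lo..mid) with a[mid..hi)
                tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
            while (i < mid)
                tmp[k++] = a[i++];
            while (j < hi)
                tmp[k++] = a[j++];
        }
        memcpy(a, tmp, n * sizeof(int));
    }
}
```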
Then it hit me that product prices can take at most 1,000,000 distinct values, so an $O(N)$ bucket sort would do. Implementing that brought the total to just over 0.10 seconds.
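The bucket (counting) sort itself is almost trivial, something like this sketch using the same `MIN_PRICE`/`MAX_PRICE` bounds as the final code:

```c
// Counting sort by price: O(N + MAX_PRICE) with one counter per price.
// count[] must be zeroed beforehand; sorted[] must have room for n ints.
void
bucketsort(const int *prices, int n, int *count, int *sorted) {
    int i, p, c, k = 0;
    for (i = 0; i < n; i++)
        count[prices[i]]++;         // tally each price
    for (p = MIN_PRICE; p <= MAX_PRICE; p++)
        for (c = count[p]; c > 0; c--)
            sorted[k++] = p;        // emit prices in ascending order
}
```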
But is sorting even necessary in the first place? We can search directly over the buckets: the number of buckets differs from the number of products, but that should barely affect the search time, so I implemented it. As a side effect, the binary search for the higher pointer's starting position is no longer needed. This improvement doesn't change the order, but the total is now under 0.10 seconds.
With the improvements so far, sorting costs essentially nothing (just dropping counts into buckets) and the search itself finishes within 0.01 seconds, so what remains is input processing.
First, I dropped `scanf()` and replaced it with `fgets()` plus `strtol()`. This brought the total to about 0.04 seconds; `scanf()` is heavy even when the format is just `"%d"`.
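That intermediate version read numbers roughly like this sketch (the helper name `readint` is mine; it assumes one number per line, as in the price and limit sections of the input):

```c
#include <stdio.h>
#include <stdlib.h>

// read the next integer, one per line, with fgets() + strtol()
static int
readint(void) {
    char line[32];
    if (fgets(line, sizeof line, stdin) == NULL)
        exit(1);                    // unexpected end of input
    return (int)strtol(line, NULL, 10);
}
```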
Next, to eliminate the function-call overhead completely, I stopped processing line by line: I allocated a buffer large enough to hold the entire input, read it all with a single `fread()`, and decoded every number in my own input routine. That finally brought the time down to 0.01 to 0.02 seconds.
It could probably be squeezed a little further by hard-coding assumptions about whitespace and the newline convention in the input.
I'm deliberately ignoring details such as the use of global variables, core dumps on unexpected input, and the lack of test code.
In the end, asymptotic order is what matters; fine tuning comes only after the order can no longer be improved.

The final code is shown below. It ended up considerably longer.
```c
#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>

#define MIN_PRICE (10)
#define MAX_PRICE (1000000)
#define MIN_LIMIT (10)
#define MAX_LIMIT (1000000)
#define MAX_PRODUCTS (500000)
#define MAX_DAYS (300)
#define MAX_INPUT_BYTES (6+1+3+1+(7+1)*MAX_PRODUCTS+(7+1)*MAX_DAYS)

int pricebucket[MAX_PRICE+1];   // pricebucket[p] = number of products priced p
int m_j[MAX_DAYS];              // campaign limit for each day
int N;
int D;

// read N prices from the buffer into the buckets.
// decoding by hand is faster than calling strtol() many times.
void
readprices(char **p) {
    char c;
    int price;
    int i;
    for (i = 0; i < N; i++) {
        price = 0;
        for (;;) {
            c = *(*p)++;
            if (isdigit((unsigned char)c)) {
                price = price*10 + c - '0';
            } else if (price >= MIN_PRICE) {
                pricebucket[price]++;
                break;
            }
        }
    }
}

// read D limits from the buffer, with the same hand decoding as readprices().
void
readlimits(char **p) {
    char c;
    int limit;
    int j;
    for (j = 0; j < D; j++) {
        limit = 0;
        for (;;) {
            c = *(*p)++;
            if (isdigit((unsigned char)c)) {
                limit = limit*10 + c - '0';
            } else if (limit >= MIN_LIMIT) {
                m_j[j] = limit;
                break;
            }
        }
    }
}

// search the buckets for the pair of products whose total price is
// nearest to, and not above, the limit: l walks up from the cheapest
// price while r walks down from (limit - l).
// return the best total price (0 if no pair qualifies).
int
searchpair(int limit) {
    int l = MIN_PRICE;
    int r = (limit - l <= MAX_PRICE) ? limit - l : MAX_PRICE;
    int currentmax = 0;
    int total;
    while (l <= r) {
        total = l+r;
        if (total <= limit && pricebucket[r] > 0) {
            if (pricebucket[l] > 0 && total > currentmax &&
                (l < r || pricebucket[l] > 1)) {
                // better pair of two distinct products
                currentmax = total;
                if (currentmax == limit) {
                    break;
                }
            }
            l++;
        } else {
            r--;
        }
    }
    return currentmax;
}

int
main(void) {
    int j;
    char *buf;
    int bufsize;
    char *dp;
    // one fread() for the whole input, then decode the numbers ourselves;
    // this is much faster than scanf().
    buf = malloc(MAX_INPUT_BYTES+1);
    if (!buf) exit(1);
    bufsize = fread(buf, 1, MAX_INPUT_BYTES, stdin);
    buf[bufsize] = '\0'; // acts as a terminating non-digit
    N = strtol(buf, &dp, 10);
    D = strtol(dp, &dp, 10);
    // prices are not stored in an array; only the buckets are kept.
    readprices(&dp);
    readlimits(&dp);
    // answer each day's query.
    for (j = 0; j < D; j++) {
        printf("%d\n", searchpair(m_j[j]));
    }
    return 0;
}
```