Untitled
unknown
plain_text
2 years ago
8.9 kB
4
Indexable
[Problem Description] There are data shown as points on a two dimensional plane consisting of x and y axes. There are no data that are at the same location. All data are at different locations. Each of the data has a category. The category is an integer that is greater than or equals to 1 but less than or equals to 10. The color of the point in [Fig. 1] means the category of each of the data. The following defines dist(A, B) which represents the distance between points A(xA, yA) and B(xB, yB) on the plane. dist(A, B) = |xA – xB| + |yA – yB| Point P(xp, yp) means the point of which the x-axis and y-axis positions are xp and yp respectively. |a| means the absolute value of a. For example, the distance between points (2, 3) and (6, 1) is |2 – 6| + |3 – 1| = 6. The K-nearest neighbors of a certain point means a total of K points which are at the nearest locations from the certain point. In the case of the same distance, smaller x-axis position is higher in priority, and in the case of the same x-axis position, smaller y-axis position is higher in priority. [Fig. 2] represents the priorities of nearest neighbors for point (4, 4). In the figure, a star-shaped point represents point (4, 4) and the numbers inside circles near each point mean priorities. In [Fig. 2], the 1-nearest neighbor of point (4, 4) is the point of which the priority is ①, and the 3-nearest neighbors are points of which the priorities are ①, ②, and ③. The 5-nearest neighbors are points with the priorities of ①, ②, ③, ④, and ⑤. As for a datum of a certain location with the category you do not know, you can assume the category with the K-nearest neighbors. Let’s see how to assume the category of the datum. The most-frequently-occurring category among the categories of the K-nearest neighbors of the datum is assumed to represent the category of the datum. If there is a tie for the most frequent, choose the one with the smallest value. As for a datum represented by point (4, 4) of [Fig. 2], If you make assumptions with 1-nearest neighbor, assume the red as the category because there is only one category which is the red. If you make assumptions with 3-nearest neighbors, assume the red as the category because there are two reds and one blue. If you make assumptions with 5-nearest neighbors, assume the blue as the category because there are two reds and three blues. If you make assumptions with 4-nearest neighbors, assume the one with the smallest value as the category among the red and the blue because there are two reds and two blues. Let’s say that the red is 1 and the blue is 2. Then, assume the red as the category of the datum. When assuming the category of a certain datum, if there is a point whose distance exceeds L among the K-nearest neighbors, that datum is considered the outlier. Let’s say that L = 4, and with 7-nearest neighbors, you are assuming the category of the datum at point (4, 4) of [Fig. 2]. Since the distance of the point of which the priority is ⑦ exceeds L, the datum of point (4, 4) is determined as the outlier. Write this kind of K-nearest neighbors program. Implement each required function by referring to the following API description. ※ The function signature below is based on C/C++. As for other languages, refer to the provided Main and User Code. The following is the description of API to be written in the User Code. void init(int K, int L) Initialization function for test case. It is called once in the beginning of each test case. K represents the number of nearest neighbors. K = 3 means 3-nearest neighbors. L is the length for determining the outlier. There are no data at the beginning. Parameters K : the number of nearest neighbors (3 ≤ K ≤ 11) L : the length for determining the outlier (4 ≤ L ≤ 100) void addSample(int mID, int mX, int mY, int mC) Add a datum with the ID of mID, x-axis position of mX, y-axis position of mY, and category of mC. It is guaranteed that mID passed when calling the function is not a value that was passed already by previous calls. When the function is called, there is no datum at point (mX, mY). Parameters mID : the ID of the datum to add (1 ≤ mID ≤ 1,000,000,000) mX : the x-axis position of the datum (1 ≤ mX ≤ 4,000) mY : the y-axis position of the datum (1 ≤ mY ≤ 4,000) mC : the category of the datum (1 ≤ mC ≤ 10) void deleteSample(int mID) Delete the datum with the ID of mID. When the datum is deleted, the point for datum mID on the plane disappears. It is guaranteed that datum mID is a datum added by calling the previous function addSample() and is not deleted yet. Parameters mID : the ID of the datum to delete (1 ≤ mID ≤ 1,000,000,000) int predict(int mX, int mY) When you suppose a certain datum is at point (mX, mY), assume the category of the datum with K-nearest neighbors, and return the assumed category value. If it is an outlier, return -1. Refer to the above description for assuming categories with K-nearest neighbors and determining outliers. When the function is called, the number of not deleted data is greater than or equals to K, and there is no datum at point (mX, mY). Parameters mX : x-axis position (1 ≤ mX ≤ 4,000) mY : y-axis position (1 ≤ mY ≤ 4,000) Returns the assumed category value if there is a datum at point (mX, mY) -1 in the case of outlier [Example] Let’s think about the case where functions are called as in [Table 1]. Order Function Return Figure 1 init(3, 4) 2 addSample(1, 5, 3, 2) 3 addSample(2, 3, 5, 1) 4 addSample(3, 5, 5, 2) 5 predict(4, 4) 2 2 6 addSample(4, 4, 6, 1) 7 predict(4, 4) 1 3 8 addSample(5, 4, 2, 2) 9 predict(4, 4) 1 4 10 predict(7, 1) -1 5 11 addSample(6, 6, 3, 1) 12 addSample(7, 8, 5, 2) 13 predict(7, 1) 2 6 14 deleteSample(1) 15 predict(7, 1) -1 7 [Table 2] shows that Order 5 found the 3-nearest neighbors for point (4, 4) when predict() was called. Data with the ID of 1, 2, and 3 are 3-nearest neighbors, and category 2 is the most-frequently-occurring category. Therefore, the determined category is 2, so return 2. ID x y category Distance Prority 1 5 3 2 2 2 2 3 5 1 2 1 3 5 5 2 2 3 [Table 3] shows that Order 7 found the 3-nearest neighbors for point (4, 4) when predict() was called. Data with the ID of 1, 2, and 4 are 3-nearest neighbors, and category 1 is the predominant category. Therefore, the determined category is 1, so return 1. ID x y category Distance Prority 1 5 3 2 2 3 2 3 5 1 2 1 3 5 5 2 2 4 4 6 1 2 2 [Table 4] shows that Order 9 found the 3-nearest neighbors for point (4, 4) when predict() was called. Data with the ID of 2, 4, and 5 are 3-nearest neighbors, and category 1 is the most common category. Therefore the determined category is 1, so return 1. ID x y category Distance Prority 1 5 3 2 2 2 3 5 1 2 1 3 5 5 2 2 4 4 6 1 2 3 5 4 3 2 2 2 [Table 5] shows that Order 10 found the 3-nearest neighbors for point (7, 1) when predict() was called. Data with the ID of 1, 3, and 5 are 3-nearest neighbors, and a point exceeds distance 4. Therefore, it is determined as an outlier, so return -1. ID x y category Distance Prority 1 5 3 2 4 2 2 3 5 1 8 3 5 5 2 6 3 4 4 6 1 8 5 4 2 2 4 1 [Table 6] shows that Order 13 found the 3-nearest neighbors for point (7, 1) when predict() was called. Data with the ID of 1, 5, and 6 are 3-nearest neighbors, and category 2 is the most-frequently-occurring category. Therefore the determined category is 2, so return 2. ID x y category Distance Prority 1 5 3 2 4 3 2 3 5 1 8 3 5 5 2 6 4 4 6 1 8 5 4 2 2 6 2 6 6 3 1 3 1 7 8 5 2 5 [Table 7] shows that Order 15 found the 3-nearest neighbors for point (7, 1) when predict() was called. Data with the ID of 5, 6, and 7 are 3-nearest neighbors, and a point exceeds distance 4. Therefore, it is determined as an outlier, so return -1. ID x y category Distance Prority 2 3 5 1 8 3 5 5 2 6 4 4 6 1 8 5 4 2 2 4 2 6 6 3 1 3 1 7 8 5 2 5 3 [Constraints] 1. init() is called in the beginning of each test case. 2. For each case, addSample() is called up to 20,000 times. 3. For each case, deleteSample() is called up to 1,000 times. 4. For each case, predict() is called up to 10,000 times. 5. The maximum number of data with an arbitrary location in a 100 * 100 square is 500. [Input and Output] As the input and output are processed in the provided code in the Main, they are not processed separately in the User Code. The output result for the sample input is in the format of “#TC number result.” It is the correct answer if the result is 100; it is the wrong answer if it is 0.
Editor is loading...
Leave a Comment