Finding the Diameter of a Tree

https://www.acmicpc.net/problem/1967

The definition of the diameter is simple.

The diameter of a tree is the maximum distance between one node to another node.

We say that the diameter of a circle is the longest distance between one point to another in a circle, so two ideas are kind of similar.

I tried several approaches and I found there is a simple method to find it.

First, these are the solutions I tried.

Picking the maximum value of all distances between one node to another
- This is the simplest way and with DFS, we can just get all distances between two arbitrary nodes, $i$ and $j$. But it takes long as the time complexity is bounded by $O(n^2)$.
Extracting all leaf nodes and finding maximum distances from one leaf to other
- To become the diameter, two nodes should be separated as far as possible, so it is obvious that these two nodes are both leaf nodes according to the structure of a tree. So I filtered leaves only and executed DFS, but the running time still exceeded. The time complexity is still $O(n^2)$.
Based on method 2, making the function not search outside of the common parent node of two leaf nodes
- This can reduce the time a little bit, but it does not change $O(n^2)$ time complexity. And finding a common parent is also quite challenging work.

In conclusion, this is an optimal solution.

Finding a tree’s diameter consists of three steps.

Select an arbitrary node $x$ and find the node $y$ which is the farthest node from $x$.
Find the node $t$ which is located in the farthest spot from the location of $y$.
Calculate the distance between $y$ and $t$.

It turns out that step 2 and 3 can be conducted at the same time.

In other words, we can get an answer by executing DFS two times.

So the time complexity becomes $O(n)$.

It is not hard to apply the above algorithm, but the problem is why that method gets the diameter correctly.

This is how to prove it.

If there are node $u$ and $v$ and the distance between them is the diameter, then after picking one arbitrary node $x$ and finding the farthest node $y$, $y$ becomes $u$ or $v$.

1. The case that x is $u$ or $v$

This case obviously proves above statement because the farthest node from $x$, which is $y$, would be $u$ or $v$.

2. The case that y is $u$ or $v$

Same as 1.

3. The case that $x$ and y* are different from $u$ and $v$

This case can be divided into 2 specific cases.

The case that $x$-$y$ does not meet $u$-$v$ As we can see from the image, we can think of the path $t$-$r$ by picking one point from each path. By making this kind of new path, $d(x, y)$ becomes bigger than $d(x, v)$. In other words, it is a contradiction to the assumption that $y$ is the farthest node from $x$.
The case that $x$-$y$ meets $u$-$v$

In this case, two paths share at least one node. Let’s say that they share only one spot. Because $d(x, y)$ is the distance between $x$ and the farthest node $y$, $d(x, y)>d(x, v)$. It leads to $d(t, y)>d(t, v)$. Then $d(u, y) = d(u, t) + d(t, y) >d(u, t) + d(t, v) = d(u, v)$, so there is a longer path than $u$-$v$, which is a contradiction to the assumption that $u$-$v$ is the diameter. So this case cannot exist and the same when there are multiple points shared.

In conclusion, case 3 cannot be possible so there are only two cases 1 and 2.

So the above algorithm can get the correct diameter.

I built the codes assuming that the arbitrary node $x$ is the root node.

Here is the final example.

#include <iostream>
#include <vector>
#include <stack>

using namespace std;

int n;
vector<pair<int, int>> graph[10001];

pair<int, int> findMax(int start) {
    bool checked[n+1] = {false, };
    stack<pair<int, int>> s;
    s.push({start, 0});
    checked[start] = true;

    pair<int, int> result (1, 0);

    while(1) {
        if (s.empty()) {
            break;
        }

        pair<int, int> cur = s.top();
        s.pop();

        if (cur.second >= result.second) {
            result.second = cur.second;
            result.first = cur.first;
        }

        for (int i=0; i<graph[cur.first].size(); ++i) {
            if (!checked[graph[cur.first][i].first]) {
                s.push({graph[cur.first][i].first, cur.second + graph[cur.first][i].second});
                checked[graph[cur.first][i].first] = true;
            }
        }
    }

    return result;
}

int main() {

    scanf("%d", &n);

    for (int i=0; i<n-1; ++i) {
        int a, b, w;
        scanf("%d %d %d", &a, &b, &w);

        graph[a].push_back({b, w});
        graph[b].push_back({a, w});
    }

    int max_node = findMax(1).first;
    int max_len = findMax(max_node).second;
    
    printf("%d\n", max_len);

    return 0;
}    

Categories