Okay, we've done our analysis: the algorithm is either quadratic or cubic, so it's polynomial.

Is that okay?

Who knows?

So the asymptotic analysis doesn't give us stopwatch times for how long our algorithm will take. But it does give us some feeling for how much longer an algorithm will take as we increase our data size.

So we know that for a quadratic algorithm, every time we double the size of our graph, we're going to roughly quadruple the amount of time our algorithm needs to run. And if our algorithm runs in cubic time, then every time we double the size of our data, the running time increases roughly eightfold.
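To make that arithmetic concrete, here is a minimal sketch (the class and method names are illustrative, not from any course code) that projects the growth factor for an O(n^k) algorithm when the input size is scaled:

```java
// Sketch: projecting how an O(n^k) running time scales with input size.
// All names here are illustrative assumptions, not course code.
public class GrowthProjection {

    // Factor by which the running time grows when the input size is
    // multiplied by sizeRatio, for an algorithm running in O(n^exponent).
    public static double growthFactor(double sizeRatio, int exponent) {
        return Math.pow(sizeRatio, exponent);
    }

    public static void main(String[] args) {
        // Doubling the input of a quadratic algorithm: ~4x the time.
        System.out.println(growthFactor(2, 2)); // 4.0
        // Doubling the input of a cubic algorithm: ~8x the time.
        System.out.println(growthFactor(2, 3)); // 8.0
    }
}
```

This is only a heuristic projection, of course; constant factors and lower-order terms still matter at any fixed data size.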

So that's interesting, and maybe now running our algorithm on some test cases will give us some sense of how long it will take on our actual data, with its 13,000 or 14,000 vertices.

Something else that's really useful, especially when we're taking our algorithms from research papers that other people have written, is to see how big a data set their implementations ran on, and how well calibrated their estimates would be for the data that we're using.

So, for example, for this particular algorithm, the research literature that we've referenced before suggests that it runs pretty well on data of the size we're working with here. So for the UCSD data, we should be just fine.

That analysis leads us to be cautiously optimistic: it seems we should go ahead and devote time and energy to implementation.

But it's still worthwhile to anticipate where things might go awry and where we might hit obstacles, so it's useful as we're implementing to think ahead to what the roadblocks might be.

In each of the three situations we've talked about so far, the potential paths you might take for your project, there are some clear potential pitfalls.

For example, for the cascading simulation that Christine presented, there's a question of whether the graph will ever really reach equilibrium as the preferences propagate through it.

Maybe the graph structure and the preferences are set up so that we never reach equilibrium, and if we run our algorithm until it reaches that equilibrium point, it will never stop. Perhaps the simulation oscillates between two configurations, or some other phenomenon occurs. So we shouldn't assume that we'll ever reach an equilibrium.
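One common safeguard is to cap the number of iterations instead of looping until equilibrium. The sketch below uses a deliberately toy update (a flipping boolean, which never settles) just to show the pattern; the state type, the update step, and the cap are all my assumptions, not the course's simulation:

```java
import java.util.function.UnaryOperator;

// Sketch: run an update step until the state stops changing, but give up
// after maxIterations so a non-converging simulation still terminates.
// The toy states below are illustrative, not the course's cascade model.
public class BoundedSimulation {

    // Returns true if the state reached a fixed point within maxIterations.
    public static <T> boolean runUntilStable(T state, UnaryOperator<T> step,
                                             int maxIterations) {
        for (int i = 0; i < maxIterations; i++) {
            T next = step.apply(state);
            if (next.equals(state)) {
                return true;   // equilibrium reached
            }
            state = next;
        }
        return false;          // never settled; report that instead of hanging
    }

    public static void main(String[] args) {
        // A state that always flips never reaches equilibrium...
        System.out.println(runUntilStable(Boolean.TRUE, b -> !b, 1000)); // false
        // ...while a constant update settles immediately.
        System.out.println(runUntilStable(Boolean.TRUE, b -> b, 1000)); // true
    }
}
```

Note that a cap like this only guarantees termination; detecting a two-cycle or other oscillation would need extra bookkeeping, such as remembering previously seen states.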

In Leo's example, looking for the minimum number of vertices that can broadcast something to an entire Twitter network, he already brought up the point that this problem can be phrased in graph-theoretic terms as minimum dominating set, which is an NP-complete problem. So we have to be careful with any algorithm we run to solve it.

We know that NP-complete problems, as far as we know, don't have really efficient solutions. So anything we do will either run really slowly, which may not work well on our real-world data, or it will be an approximation. In that case, are we okay with just an approximation? What sort of errors are allowed, and what are we willing to put up with?
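As one illustration of the approximation route, a standard greedy heuristic for dominating set repeatedly picks the vertex that dominates the most not-yet-dominated vertices. This is a sketch under my own adjacency-list representation, not Leo's implementation, and greedy gives no optimality guarantee beyond a logarithmic approximation factor:

```java
import java.util.*;

// Sketch: greedy approximation for minimum dominating set.
// The Map<Integer, Set<Integer>> adjacency representation is an
// assumption for illustration, not the course's graph classes.
public class GreedyDominatingSet {

    public static Set<Integer> greedyDominate(Map<Integer, Set<Integer>> adj) {
        Set<Integer> undominated = new HashSet<>(adj.keySet());
        Set<Integer> chosen = new HashSet<>();
        while (!undominated.isEmpty()) {
            int best = -1, bestCovered = -1;
            for (int v : adj.keySet()) {
                // v dominates itself and all of its neighbors.
                int covered = undominated.contains(v) ? 1 : 0;
                for (int u : adj.get(v)) {
                    if (undominated.contains(u)) covered++;
                }
                if (covered > bestCovered) { bestCovered = covered; best = v; }
            }
            chosen.add(best);
            undominated.remove(best);
            undominated.removeAll(adj.get(best));
        }
        return chosen;
    }

    public static void main(String[] args) {
        // Star graph: center 0 connected to 1..4. Greedy picks just {0}.
        Map<Integer, Set<Integer>> star = new HashMap<>();
        star.put(0, new HashSet<>(Arrays.asList(1, 2, 3, 4)));
        for (int i = 1; i <= 4; i++) star.put(i, new HashSet<>(Arrays.asList(0)));
        System.out.println(greedyDominate(star)); // [0]
    }
}
```

Even this simple version is quadratic per round as written, which is exactly the kind of practicality question raised above once the graph has thousands of vertices.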

And for the community detection problem that we've been talking about, we want to think about how the communities we detect in the graph really relate to the ground truth in our data.

Because what we're really finding when we look for these communities is something graph-theoretic and structural about the data. Do we know, sociologically, that this will actually translate to communities in the real world? Maybe, maybe not. Maybe we need to do some more analysis there.

So these sorts of pitfalls are good to keep in mind, and it's useful to anticipate the concerns about whether the algorithm will be correct: whether it will actually find those communities, and whether they will actually correspond to the formal definitions that we have. We can have correctness both in terms of whether our model is correct, and also whether our algorithm meets our model.

Then we can ask questions about practicality: will our algorithm work with the data size that we have, and will our approximation be good enough? And we also have questions about termination.

Now, even if we think that our algorithm is great, and even if we're aware of these potential risks and think them through carefully as we analyze, we also have more concrete risks to consider when we're budgeting our time for a project.

We know that part of the project time will be spent designing the classes: thinking about the object-oriented design, the inheritance structure, and the interfaces you'll be writing and implementing, looking back to Course Three of the specialization, and even Course Two, where we talked about inheritance structures, and deciding which methods and fields belong in each of the classes you design. So we need to allocate some time for that design process to happen.

We also want to spend some time looking at what people have done already; we don't need to reinvent the wheel. Can we use some library functions? Can we search the documentation to see whether we can bring in functionality from outside and really help our project go further by using what other people have done?

Then we want to think about data structures and the specific low-level algorithms we'll be using to optimize the performance of our algorithm. If we're looking up information, should we be using a HashSet or a HashMap instead of a list? If we're continually changing the size of our data structure, should we be using a LinkedList instead of an ArrayList?
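To make the lookup point concrete, here is a minimal sketch (class and method names are illustrative) of the same membership test backed by a List versus a HashSet. Both give the same answers, but `contains` is expected O(1) on the HashSet and O(n) on the list:

```java
import java.util.*;

// Sketch: the same membership check on a List vs. a HashSet.
// Both return the same answers; HashSet.contains runs in expected
// O(1) time per lookup, while List.contains scans in O(n).
public class LookupComparison {

    public static boolean inList(List<Integer> data, int key) {
        return data.contains(key);      // linear scan
    }

    public static boolean inSet(Set<Integer> data, int key) {
        return data.contains(key);      // hash lookup
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>();
        for (int i = 0; i < 100_000; i++) list.add(i);
        Set<Integer> set = new HashSet<>(list);

        System.out.println(inList(list, 99_999)); // true, after a full scan
        System.out.println(inSet(set, 99_999));   // true, near-constant time
    }
}
```

On a graph with tens of thousands of vertices, a lookup like this may sit inside a loop that runs millions of times, which is where the asymptotic difference starts to dominate the running time.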

We want to think through these potential bottlenecks and the issues that will come up as we implement. And at each point we want to both budget our time and foreshadow the work that's coming, so that we can steer ourselves in the right direction.