November 9, 2018
David Naughton - UMN Libraries Web Development
In the Dark: Excellent investigative journalism podcast by Madeleine Baran. Season One is about what went wrong in the Jacob Wetterling case.
Perhaps I could best describe my experience of doing mathematics in terms of entering a dark mansion. One goes into the first room, and it’s dark, completely dark. One stumbles around bumping into the furniture, and gradually, you learn where each piece of furniture is, and finally, after six months or so, you find the light switch.
— Andrew Wiles
Extended quotation on the "Impossibility of Unfamiliar Optimization When Decision Time Is Scarce" follows.
In the case of an unfamiliar problem, the decision maker must devise a method for finding the alternative to be chosen before it can be applied. This leads to two levels of decision-making activities which both take time:
Level 1: Finding the alternative to be chosen.
Level 2: Finding a method for Level 1.
What is the optimal approach to the problem of Level 2? One can hardly imagine that this problem is familiar. Presumably a decision maker who does not immediately know what to do on Level 1 will also not be familiar with the task of Level 2. Therefore, some time must be spent to find an optimal method for solving the task of Level 2. Thus we arrive at Level 3.
It is clear that in this way we obtain an infinite sequence of levels k = 2, 3, ... provided that finding an optimal method for Level k continues to be unfamiliar for every k.
Level k: Finding a method for Level k - 1.
Selten, Reinhard. "What Is Bounded Rationality?" Bounded Rationality: The Adaptive Toolbox. Ed. G. Gigerenzer and R. Selten. Cambridge, MA: MIT Press, 2001. 13-36.
The more items you are unfamiliar with, and the more you are unfamiliar with each one, the greater your uncertainty.
More detail and illustrative examples about some of these items later.
Promising a minimum viable product (MVP) in two weeks may work well for a greenfield project, but is probably unrealistic for most integration projects.
Allow time for discovery and understanding of systems to be integrated.
Maybe the most valuable benefit of agile. Well explained in this talk, which also beautifully explains agile vs. waterfall, and how software engineering is different from other kinds of engineering: Real Software Engineering
If Pure allowed updating individual records via an API, rather than requiring uploads of entire datasets in Sync, feedback cycles would be dramatically shorter!
Seductive idea: building a single system to do many things will save work ("kill two birds with one stone"). In reality, such a system tends to have tight dependencies among its parts, so multiple problems must be solved at once. Long feedback cycles!
Break systems down into smaller systems, with as few dependencies as possible, even if it seems like more work.
Rationalists claim that there are significant ways in which our concepts and knowledge are gained independently of sense experience. Empiricists claim that sense experience is the ultimate source of all our concepts and knowledge.
— Stanford Encyclopedia of Philosophy
Less Leibniz...
...more Hume...
...and Leonardo!
No plan survives contact with the enemy.
— Helmuth von Moltke the Elder
No design survives contact with reality.
{your.pure.domain}/ws/api/{pure-version}/api-docs/index.html#/
Example: experts.umn.edu/ws/api/512/api-docs/index.html#/
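A minimal sketch of filling in the URL template above for a given Pure installation. The function name pure_api_docs_url is hypothetical, and the https scheme is an assumption, since the template itself gives only the domain and path.

```python
def pure_api_docs_url(domain: str, version: str, scheme: str = "https") -> str:
    """Fill in the API docs URL template for a Pure installation.

    The scheme defaults to https, which is an assumption; the template
    in the slides specifies only the domain, API version, and path.
    """
    return f"{scheme}://{domain}/ws/api/{version}/api-docs/index.html#/"


# Values taken from the example above.
print(pure_api_docs_url("experts.umn.edu", "512"))
```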
Some higher-level overview documentation would make this even better. I also could find no schema for the changes endpoint.
github.com/UMNLibraries/pureapi
Better documentation and an open source license coming soon.
OIT Data Warehouse containing data from PeopleSoft. Completely unfamiliar.
Affiliate jobs (e.g., adjunct faculty) in a "person of interest" table, with very different columns than the main "jobs" table, and with even worse data entry.
In cases like ours, where we have no unique IDs for people's jobs, Pure automatically assigns its own unique IDs to jobs:
autoid:{organisation-id}{job-title}{employment-type}{start-date}
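A rough sketch of why this scheme matters for uploads, assuming the ID is a plain concatenation of the four fields exactly as the template suggests. The Job class, the auto_id function, and all field values below are hypothetical illustrations, not Pure's actual implementation or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    # Hypothetical field names mirroring the placeholders in the template.
    organisation_id: str
    job_title: str
    employment_type: str
    start_date: str


def auto_id(job: Job) -> str:
    """Build an ID by concatenating the fields in template order.

    Assumption: no separators between fields, as in the template.
    """
    return (
        f"autoid:{job.organisation_id}{job.job_title}"
        f"{job.employment_type}{job.start_date}"
    )


# Made-up records: the same job before and after a title correction.
before = Job("ORG1", "Asst Professor", "faculty", "2018-01-01")
after = Job("ORG1", "Assistant Professor", "faculty", "2018-01-01")

# Under this assumption, any change to a component field yields a new ID.
print(auto_id(before) == auto_id(after))
```

If IDs are derived this way, a corrected job title in a new upload would look like a different job rather than an update to the existing one.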
Did not consider the effects of a new upload on existing data carefully enough. One reason: I didn't know about these Pure-generated job IDs. (I also don't think these Pure ID schemes are documented.) Even worse, I mistakenly synced to production!
May seem easy, especially for the outgoing. But what's hard is making ourselves vulnerable, handling conflict, admitting failure or lack of knowledge.
The more complex and unfamiliar the project, the more likely the need for difficult communication.
In all the voluminous Pure documentation, how do we find what we need?