diff on cities

How can we compare two urban plans?

Tokyo - Boston = ?

or

Boston 1980 - Boston 2021 = ?

what is the quality distance between two cities?

what to compare?

  • quantitative?
  • qualitative?
  • radar chart results?
  • simulation results?
  • table lego config?
  • discrete spatial attributes
  • zoning map?
    • 2D? 3D?
  • network topology?
  • quality terms
    • vibrancy?
  • history?
    • the current way of doing comparison
  • phenotype? genotype?
  • Signified and signifier de2011course
    • signifier: the word (English, Japanese)
    • signified: the thing the word refers to

comparability

how to compare?

sequential data

Sequential data = words, lines in a file, musical chords. These can be compared with Levenshtein distance.

Levenshtein distance may also be referred to as edit distance, although that term may also denote a larger family of distance metrics known collectively as edit distance.

https://en.wikipedia.org/wiki/Levenshtein%5Fdistance

levenshtein1966binary
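A minimal sketch of Levenshtein distance over any sequence of comparable items, so the same function can handle characters, words, lines, or chord symbols. The names `levenshtein` and `levenshtein_str` are illustrative and not taken from any library.

```rust
/// Levenshtein distance between two sequences of comparable items.
/// Works for characters, words, lines, or chord symbols alike.
fn levenshtein<T: PartialEq>(a: &[T], b: &[T]) -> usize {
    // prev[j] = distance between the items of `a` seen so far
    // and the first j items of `b`.
    let mut prev: Vec<usize> = (0..=b.len()).collect();

    for (i, item_a) in a.iter().enumerate() {
        // Turning i + 1 items of `a` into an empty sequence costs i + 1 deletions.
        let mut curr = vec![i + 1];
        for (j, item_b) in b.iter().enumerate() {
            let substitute = prev[j] + if item_a == item_b { 0 } else { 1 };
            let delete = prev[j + 1] + 1;
            let insert = curr[j] + 1;
            curr.push(substitute.min(delete).min(insert));
        }
        prev = curr;
    }
    prev[b.len()]
}

/// Convenience wrapper for strings, compared character by character.
fn levenshtein_str(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    levenshtein(&a, &b)
}
```

For example, `levenshtein_str("kitten", "sitting")` is 3, and dropping one chord from a four-chord progression gives a distance of 1.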

on graph edit distance

  1. survey gao2010survey
  2. for trees zhang1989simple

TODO image distance

json data

on json diff

cao2016json

https://github.com/sonnyp/JSON8/tree/master/packages/patch

diffjson

types of values

  1. True

    Note: true and false are separate types

  2. False

  3. Null

  4. String Value wrapped in " (quotation marks). The quotation marks always accompany the data, which distinguishes strings from other values. ex) true and "true" are different

  5. Number Number as in the JavaScript specification: a numerical value, with no distinction between ints and floats, nor of bit length.

  6. Array A list of values, which may be empty or contain any JSON value.

  7. Object A collection of key-value pairs. Each key must be a string. The order of the keys is not taken into account.

We define simple values as value types 1. to 5. and meta values as types 6. and 7., since the latter can contain any value.
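The seven value types can be written down as a small Rust enum. This is a sketch for illustration only: `Json` and `is_meta` are hypothetical names, numbers are stored as `f64` purely to keep it short, and arrays and objects are treated as the meta values because they can contain any value.

```rust
/// The seven value types listed above, as a minimal stand-in for a JSON value.
#[derive(Debug, Clone, PartialEq)]
enum Json {
    True,
    False,
    Null,
    String(String),
    Number(f64),                 // no int/float or bit-length distinction
    Array(Vec<Json>),            // may be empty, may hold any value
    Object(Vec<(String, Json)>), // keys are strings; order is not significant
}

impl Json {
    /// Meta values can contain any other value; simple values cannot.
    fn is_meta(&self) -> bool {
        matches!(self, Json::Array(_) | Json::Object(_))
    }
}
```

With this encoding `Json::True` and `Json::String("true".to_string())` are different values, which matches the note on quotation marks.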

comparison

There are four cases: incomparable (inc), different (diff), maybe (mb), and same.

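| \      | true | false | null | string | number | array | object |
|--------|------|-------|------|--------|--------|-------|--------|
| true   | same | diff  | inc  | inc    | inc    | inc   | inc    |
| false  | diff | same  | inc  | inc    | inc    | inc   | inc    |
| null   | inc  | inc   | same | inc    | inc    | inc   | inc    |
| string | inc  | inc   | inc  | mb1    | inc    | inc   | inc    |
| number | inc  | inc   | inc  | inc    | mb2    | inc   | inc    |
| array  | inc  | inc   | inc  | inc    | inc    | mb3   | inc    |
| object | inc  | inc   | inc  | inc    | inc    | inc   | mb4    |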

When the top-level values are tagged incomparable, they should be reported as different.
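The table and the top-level rule can be read as a lookup on the two value types alone. A sketch under that reading; `Kind`, `Verdict`, `classify`, and `classify_top_level` are hypothetical names introduced here.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Kind {
    True,
    False,
    Null,
    String,
    Number,
    Array,
    Object,
}

#[derive(Debug, PartialEq)]
enum Verdict {
    Same,
    Different,
    Maybe,
    Incomparable,
}

/// Look up one cell of the comparison table from the two value types.
fn classify(a: Kind, b: Kind) -> Verdict {
    match (a, b) {
        (Kind::True, Kind::True) | (Kind::False, Kind::False) | (Kind::Null, Kind::Null) => {
            Verdict::Same
        }
        (Kind::True, Kind::False) | (Kind::False, Kind::True) => Verdict::Different,
        // Same type with a payload: the contents decide (mb1 to mb4).
        (Kind::String, Kind::String)
        | (Kind::Number, Kind::Number)
        | (Kind::Array, Kind::Array)
        | (Kind::Object, Kind::Object) => Verdict::Maybe,
        _ => Verdict::Incomparable,
    }
}

/// At the top level an incomparable pair is reported as different.
fn classify_top_level(a: Kind, b: Kind) -> Verdict {
    match classify(a, b) {
        Verdict::Incomparable => Verdict::Different,
        other => other,
    }
}
```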

maybe cases

In all cases, the default should be comparing hash values. If the hash value is the same, we identify the two to be identical. The following illustrates the strategy when the two are different.

  • mb1 (String)

    Strings are serial data. Comparing them should run a standard diff algorithm and show how the two strings differ.

    • TODO should this be line based? or character based?

      use serde_json::json;

      // `diffjson` is the function proposed in these notes.
      let a = json!("hello world.");
      let b = json!("hola world.");

      let d = diffjson(&a, &b);

      println!("{:?}", &d);
      // {"d": "hello", "a": "hola"}
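The notes leave open whether the string diff should be line based or character based. The sketch below assumes a word-based split instead, chosen only because the example above deletes and adds whole words, and uses a longest common subsequence (LCS) table, which is one standard way to build a diff. `Edit` and `diff_words` are hypothetical names and not part of `diffjson`.

```rust
#[derive(Debug, PartialEq)]
enum Edit<'a> {
    Keep(&'a str),
    Delete(&'a str),
    Add(&'a str),
}

/// Word-level diff based on the longest common subsequence (LCS).
fn diff_words<'a>(a: &'a str, b: &'a str) -> Vec<Edit<'a>> {
    let a: Vec<&str> = a.split_whitespace().collect();
    let b: Vec<&str> = b.split_whitespace().collect();

    // lcs[i][j] = length of the LCS of a[i..] and b[j..]
    let mut lcs = vec![vec![0usize; b.len() + 1]; a.len() + 1];
    for i in (0..a.len()).rev() {
        for j in (0..b.len()).rev() {
            lcs[i][j] = if a[i] == b[j] {
                lcs[i + 1][j + 1] + 1
            } else {
                lcs[i + 1][j].max(lcs[i][j + 1])
            };
        }
    }

    // Walk the table from the start; on ties, deletions come before additions.
    let (mut i, mut j) = (0, 0);
    let mut edits = Vec::new();
    while i < a.len() && j < b.len() {
        if a[i] == b[j] {
            edits.push(Edit::Keep(a[i]));
            i += 1;
            j += 1;
        } else if lcs[i + 1][j] >= lcs[i][j + 1] {
            edits.push(Edit::Delete(a[i]));
            i += 1;
        } else {
            edits.push(Edit::Add(b[j]));
            j += 1;
        }
    }
    // Whatever is left on either side is a pure deletion or addition.
    edits.extend(a[i..].iter().map(|&w| Edit::Delete(w)));
    edits.extend(b[j..].iter().map(|&w| Edit::Add(w)));
    edits
}
```

For "hello world." against "hola world." this yields a deletion of "hello", an addition of "hola", and an unchanged "world.".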
      

mb2 (number)

Numbers can be any numerical type. This could be a u64, i64, or f64.

  • TODO check if this library detects the right value…

    let a = json!(20);
    let b = json!(10.0);
    
    let d = diffjson(&a, &b);
    
    println!("{:?}", &d);
    // [{"d": 20, "a":10.0}]
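One open question is whether 20 and 20.0 should count as the same number, since JSON itself does not distinguish ints from floats. A sketch of comparing by numerical value across the three storage types; `Num` and `same_number` are hypothetical names, and the conversion to `f64` loses precision for integers beyond 2^53.

```rust
/// A number stored in one of the three Rust types mentioned above.
#[derive(Debug, Clone, Copy)]
enum Num {
    U(u64),
    I(i64),
    F(f64),
}

impl Num {
    /// Lossy for integers beyond 2^53; acceptable for a sketch.
    fn as_f64(self) -> f64 {
        match self {
            Num::U(n) => n as f64,
            Num::I(n) => n as f64,
            Num::F(n) => n,
        }
    }
}

/// Treat two numbers as equal when their numerical values agree,
/// regardless of which Rust type stores them.
fn same_number(a: Num, b: Num) -> bool {
    a.as_f64() == b.as_f64()
}
```

Under this rule `Num::U(20)` and `Num::F(20.0)` are the same, while `Num::U(20)` and `Num::F(10.0)` are different and would produce a diff entry.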
    
    

mb3 (array)

The hash of this value is calculated by concatenating each element's hash and hashing the result.

// TODO: this could be done using incremental updates

// Assumes a helper `fn hash(v: &serde_json::Value) -> Hash`,
// where `Hash` exposes `as_str()`.
fn hash_array(array: &[serde_json::Value]) -> Hash {
    let mut agg_hash = String::new();

    for e in array {
        let h = hash(e);
        agg_hash.push_str(h.as_str()); // concat
    }
    hash(&serde_json::json!(agg_hash))
}

If the two hashes do not match, we go through the elements and calculate the Levenshtein distance.

ref: Levenshtein Distances https://en.wikipedia.org/wiki/Levenshtein%5Fdistance
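The two steps for arrays (compare an aggregate hash first, fall back to Levenshtein distance on a mismatch) could be sketched as follows. This is not the notes' implementation: `DefaultHasher` from the standard library stands in for the unspecified hash function, the aggregate is a hash of the list of element hashes rather than of a concatenated string, and `element_hash` and `array_distance` are hypothetical names.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hash one element; stands in for the `hash` helper in the notes.
fn element_hash<T: Hash>(e: &T) -> u64 {
    let mut h = DefaultHasher::new();
    e.hash(&mut h);
    h.finish()
}

/// Distance between two arrays: 0 when the aggregate hashes match,
/// otherwise the Levenshtein distance over the per-element hashes.
fn array_distance<T: Hash>(a: &[T], b: &[T]) -> usize {
    let ha: Vec<u64> = a.iter().map(|e| element_hash(e)).collect();
    let hb: Vec<u64> = b.iter().map(|e| element_hash(e)).collect();

    // Step 1: aggregate hash over all element hashes.
    if element_hash(&ha) == element_hash(&hb) {
        return 0; // treated as identical
    }

    // Step 2: Levenshtein distance, comparing elements by their hashes.
    let mut prev: Vec<usize> = (0..=hb.len()).collect();
    for (i, x) in ha.iter().enumerate() {
        let mut curr = vec![i + 1];
        for (j, y) in hb.iter().enumerate() {
            let cost = if x == y { 0 } else { 1 };
            curr.push((prev[j] + cost).min(prev[j + 1] + 1).min(curr[j] + 1));
        }
        prev = curr;
    }
    prev[hb.len()]
}
```

Comparing elements by hash means that a changed nested element counts as one substitution, however large the change inside it is.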

mb4 (object)

The JSON specification does not guarantee the order of keys. Yet to get reproducible hash values, the keys have to be ordered. Unlike git (footnote 1), they can be ordered within the current level of the tree alone, which effectively makes the problem identical to handling arrays.
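A sketch of the ordering idea: sort the entries of the current level by key, then hash them in sequence as if they were an array. Values are plain strings here only to keep the example short, `DefaultHasher` stands in for the unspecified hash function, and `object_hash` is a hypothetical name.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hash an object (here: key/value string pairs) independently of key order
/// by sorting the entries of the current level before hashing them.
fn object_hash(entries: &[(&str, &str)]) -> u64 {
    let mut sorted = entries.to_vec();
    sorted.sort_by(|a, b| a.0.cmp(b.0)); // order by key only

    let mut h = DefaultHasher::new();
    for (k, v) in sorted {
        k.hash(&mut h);
        v.hash(&mut h);
    }
    h.finish()
}
```

Two objects that differ only in key order, such as `[("name", "Tokyo"), ("zone", "residential")]` and the same pairs reversed, then hash to the same value.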

diff output convention

Diff output has three states that indicate line-based modification: 'a' stands for addition, 'd' for deletion, and 'c' for change.
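The three one-letter codes map naturally onto an enum. A small sketch; `Op`, `code`, and `from_code` are hypothetical names.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Op {
    Add,
    Delete,
    Change,
}

impl Op {
    /// One-letter code used in the diff output.
    fn code(self) -> char {
        match self {
            Op::Add => 'a',
            Op::Delete => 'd',
            Op::Change => 'c',
        }
    }

    /// Parse a one-letter code; anything else is rejected.
    fn from_code(c: char) -> Option<Op> {
        match c {
            'a' => Some(Op::Add),
            'd' => Some(Op::Delete),
            'c' => Some(Op::Change),
            _ => None,
        }
    }
}
```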

Architecture

Figure 1: three types of houses from Frank Lloyd Wright, plans taken from march2019geometry pp.27

Figure 2: above dwellings share the same network topology. Figure taken from march2019geometry pp.28

from march2019geometry

City / Urban Planning

The radar chart: the idea is to have a fixed set of metrics so that cities become comparable.
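With a fixed set of axes, "city A - city B" becomes a per-axis subtraction. A sketch under that assumption: the axes themselves are left abstract, the Euclidean length is only one possible way to collapse the difference into a single number, and `radar_diff` and `radar_distance` are hypothetical names.

```rust
/// Per-axis difference between two radar charts that share the same N axes.
fn radar_diff<const N: usize>(a: &[f64; N], b: &[f64; N]) -> [f64; N] {
    let mut out = [0.0; N];
    for i in 0..N {
        out[i] = a[i] - b[i];
    }
    out
}

/// Euclidean length of the difference, as one possible scalar distance.
fn radar_distance<const N: usize>(a: &[f64; N], b: &[f64; N]) -> f64 {
    radar_diff(a, b).iter().map(|d| d * d).sum::<f64>().sqrt()
}
```

Because the number of axes is part of the type, comparing two charts with different sets of metrics does not compile, which mirrors the requirement that the set of metrics be fixed.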

tags: phd

Bibliography

[alexander1964city] Alexander, A city is not a tree, 1965, 124, (1964).

[de2011course] De Saussure, Course in general linguistics, Columbia University Press (2011).

[levenshtein1966binary] Levenshtein, Binary codes capable of correcting deletions, insertions, and reversals, in: Soviet Physics Doklady, 707-710 (1966).

[gao2010survey] Gao, Xiao, Tao & Li, A survey of graph edit distance, Pattern Analysis and applications, 13(1), 113-129 (2010).

[zhang1989simple] Zhang & Shasha, Simple fast algorithms for the editing distance between trees and related problems, SIAM journal on computing, 18(6), 1245-1262 (1989).

[cao2016json] Cao, Falleri, Blanc & Zhang, JSON Patch for Turning a Pull REST API into a Push, in: International Conference on Service-Oriented Computing, 435-449 (2016).

[march2019geometry] March & Steadman, The geometry of environment: an introduction to spatial organization in design (2019).


  1. git orders files and directories by their relative path from the repo's root directory.