Projects 2 & 3: HTTP Server and Data Analysis

Project 2: HTTP Server (16-18 hours)

Build: A REST API server that handles HTTP requests and responds with JSON

Learning Outcomes:

- Network programming fundamentals
- HTTP protocol handling (requests, responses, headers)
- Routing and request dispatching
- JSON serialization/deserialization
- Stateful server management
- Error handling in concurrent scenarios

[Figure: HTTP Request/Response Cycle]


Step 1: HTTP Request/Response Types

Define the data structures for HTTP communication:

```
// file: http_types.zbr
// teaches: protocol data structures
// project: Project-2-HTTP-Server

class HttpRequest
    var method as str                 # GET, POST, PUT, DELETE
    var path as str                   # /api/users, /api/users/123
    var query as HashMap(str, str)    # URL parameters
    var headers as HashMap(str, str)
    var body as str                   # Request body for POST/PUT

class HttpResponse
    var status_code as int            # 200, 201, 400, 404, 500
    var status_message as str         # "OK", "Created", "Not Found"
    var headers as HashMap(str, str)
    var body as str                   # Response body (JSON, HTML, etc.)

    shared def ok(body as str) as HttpResponse
        var resp = HttpResponse()
        resp.status_code = 200
        resp.status_message = "OK"
        resp.body = body
        return resp

    shared def created(body as str) as HttpResponse
        var resp = HttpResponse()
        resp.status_code = 201
        resp.status_message = "Created"
        resp.body = body
        return resp

    shared def bad_request(message as str) as HttpResponse
        var resp = HttpResponse()
        resp.status_code = 400
        resp.status_message = "Bad Request"
        resp.body = message
        return resp

    shared def not_found as HttpResponse
        var resp = HttpResponse()
        resp.status_code = 404
        resp.status_message = "Not Found"
        resp.body = "Resource not found"
        return resp

    shared def internal_error(message as str) as HttpResponse
        var resp = HttpResponse()
        resp.status_code = 500
        resp.status_message = "Internal Server Error"
        resp.body = message
        return resp

    def format_response as str
        var output = "HTTP/1.1 ${status_code} ${status_message}\r\n"
        output = output.concat("Content-Length: ${body.len}\r\n")
        output = output.concat("Content-Type: application/json\r\n")
        output = output.concat("\r\n")
        output = output.concat(body)
        return output

class User
    var id as int
    var name as str
    var email as str

    def to_json as str
        return "{\"id\": ${id}, \"name\": \"${name}\", \"email\": \"${email}\"}"
```

Note that the factory methods (ok, created, bad_request, not_found, internal_error) are marked shared so the router can call them directly on the class, as in HttpResponse.not_found().
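For reference, a 200 response built with ok("{\"status\": \"healthy\"}") and rendered by format_response produces a raw HTTP/1.1 message like this (assuming body.len counts characters):

```
HTTP/1.1 200 OK
Content-Length: 21
Content-Type: application/json

{"status": "healthy"}
```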


Step 2: Request Routing

Implement the routing system that maps paths to handlers:

```
// file: router.zbr
// teaches: request routing and dispatching
// project: Project-2-HTTP-Server

interface RequestHandler
    def handle(request as HttpRequest) as HttpResponse

class Router
    var routes as HashMap(str, RequestHandler) = HashMap()

    def register(path as str, handler as RequestHandler)
        routes.put(path, handler)

    def route_request(request as HttpRequest) as Result(HttpResponse, str)
        if routes.contains(request.path)
            var handler = routes.fetch(request.path)
            var response = handler.handle(request)
            return Result.ok(response)

        # Path not found
        return Result.ok(HttpResponse.not_found())

    def list_routes as List(str)
        var paths as List(str) = List()
        for path in routes
            paths.add(path)
        return paths

class UserHandler implements RequestHandler
    shared var users as HashMap(int, User) = HashMap()
    shared var next_id as int = 1

    def handle(request as HttpRequest) as HttpResponse
        if request.method == "GET"
            return handle_get(request)
        elif request.method == "POST"
            return handle_post(request)
        elif request.method == "PUT"
            return handle_put(request)
        elif request.method == "DELETE"
            return handle_delete(request)

        return HttpResponse.bad_request("Method not allowed")

    def handle_get(request as HttpRequest) as HttpResponse
        # If path is /api/users, list all users
        if request.path == "/api/users"
            var response_body = "["
            var first = true
            for id, user in users
                if not first
                    response_body = response_body.concat(",")
                response_body = response_body.concat(user.to_json())
                first = false
            response_body = response_body.concat("]")
            return HttpResponse.ok(response_body)

        return HttpResponse.not_found()

    def handle_post(request as HttpRequest) as HttpResponse
        # Create new user from JSON body
        # Simplified: real impl would parse JSON properly
        var user = User()
        user.id = next_id
        user.name = "User ${next_id}"
        user.email = "user${next_id}@example.com"

        users.put(user.id, user)
        next_id = next_id + 1

        return HttpResponse.created(user.to_json())

    def handle_put(request as HttpRequest) as HttpResponse
        return HttpResponse.bad_request("PUT not yet implemented")

    def handle_delete(request as HttpRequest) as HttpResponse
        return HttpResponse.bad_request("DELETE not yet implemented")

class HealthHandler implements RequestHandler
    def handle(request as HttpRequest) as HttpResponse
        var response = "{\"status\": \"healthy\"}"
        return HttpResponse.ok(response)
```


Step 3: Server Implementation

Build the actual server that listens for connections:

```
// file: http_server.zbr
// teaches: network server programming
// project: Project-2-HTTP-Server

class HttpServer
    var port as int
    var router as Router
    var is_running as bool = false

    def init(port as int)
        this.port = port
        router = Router()

    def register_handler(path as str, handler as RequestHandler)
        router.register(path, handler)

    def start as Result(bool, str)
        is_running = true

        # Register default handlers
        register_handler("/health", HealthHandler())
        register_handler("/api/users", UserHandler())

        print "Server starting on port ${port}..."
        print "Available routes:"
        var routes = router.list_routes()
        for route in routes
            print " ${route}"

        # Main server loop
        # In a real implementation, this would:
        # 1. Create a TCP socket listening on the port
        # 2. Accept connections in a loop
        # 3. Parse the HTTP request from the socket
        # 4. Route the request to a handler
        # 5. Send the response back to the client
        # 6. Close the connection

        # Simplified simulation
        handle_request_simulation()

        return Result.ok(true)

    def handle_request_simulation
        # Simulate receiving a GET /health request
        var request = HttpRequest()
        request.method = "GET"
        request.path = "/health"
        request.body = ""
        request.query = HashMap()
        request.headers = HashMap()

        var result = router.route_request(request)
        if result.isOk()
            var response = result.okValue()
            print response.format_response()

class Main
    shared def main
        var server = HttpServer(8080)

        var result = server.start()

        if result.isErr()
            print "Error: ${result.errValue()}"
        else
            print "Server running. (Ctrl+C to stop)"
```
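Running this simulation should print roughly the following (route order may differ, since it depends on HashMap iteration order):

```
Server starting on port 8080...
Available routes:
 /health
 /api/users
HTTP/1.1 200 OK
Content-Length: 21
Content-Type: application/json

{"status": "healthy"}
Server running. (Ctrl+C to stop)
```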


Exercises

1. Add query parameter parsing: Extract ?name=value&key=value from URLs (a starting sketch appears after this list)
2. Support path parameters: /api/users/123 extracting the ID
3. Request logging: Log each request (timestamp, method, path, response code)
4. Middleware system: Add pre/post processing hooks
5. JSON parsing: Parse POST body as JSON to extract fields
6. Status codes: Return appropriate status codes (201 for created, 400 for bad request, etc.)
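One possible starting point for exercise 1, using only the string and HashMap operations already seen in this project (the class and method names here are hypothetical, and split is assumed to accept any separator, not just a space):

```
// Sketch: turn "name=value&key=value" into a HashMap(str, str)
class QueryParser
    shared def parse_query(query_string as str) as HashMap(str, str)
        var params as HashMap(str, str) = HashMap()
        var pairs = query_string.split("&")
        for pair in pairs
            var parts = pair.split("=")
            if parts.count() == 2
                params.put(parts.at(0).trim(), parts.at(1).trim())
        return params
```

The server would call this on everything after the ? in the request path and store the result in request.query.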

Testing the Server

```
# Start the server
zebra http_server.zbr &

# Test endpoints
curl http://localhost:8080/health
curl http://localhost:8080/api/users
curl -X POST http://localhost:8080/api/users
```
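Once real socket handling replaces the simulation, the handlers above would answer roughly as follows (bodies taken from the handler code; the user counter starts at 1):

```
# GET /health
{"status": "healthy"}

# GET /api/users  (before any POST)
[]

# POST /api/users
{"id": 1, "name": "User 1", "email": "user1@example.com"}
```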


Project 3: Text Data Analysis (12-15 hours)

Build: Analyze text data (n-grams, frequencies, similarity) to uncover linguistic patterns

Learning Outcomes:

- Advanced data structures (HashMap, nested structures)
- Algorithms (n-gram extraction, frequency analysis, similarity scoring)
- File batch processing
- Statistical analysis
- Performance optimization with data structures

[Figure: Text Analysis Pipeline]


Step 1: Frequency Analysis

Start with counting word frequencies:

```
// file: frequency_analysis.zbr
// teaches: frequency counting and sorting
// project: Project-3-Data-Analysis

class WordFrequency
    var word as str
    var count as int

    def init(word as str, count as int)
        this.word = word
        this.count = count

    def to_string as str
        return "${word}: ${count}"

class FrequencyAnalyzer
    shared def analyze_text(text as str) as List(WordFrequency)
        var words = text.lower().split(" ")
        var freq as HashMap(str, int) = HashMap()

        # Count occurrences
        for word in words
            var cleaned = word.trim()
            if cleaned.len > 0
                if freq.contains(cleaned)
                    freq.put(cleaned, freq.fetch(cleaned) + 1)
                else
                    freq.put(cleaned, 1)

        # Convert to list and sort by frequency
        var results as List(WordFrequency) = List()
        for word, count in freq
            var wf = WordFrequency(word, count)
            results.add(wf)

        # Simple bubble sort (in-place)
        var i = 0
        while i < results.count()
            var j = 0
            while j < results.count() - 1
                var current = results.at(j)
                var next = results.at(j + 1)
                if current.count < next.count
                    # Swap (simplified)
                    var temp = current
                    results.at(j) = next
                    results.at(j + 1) = temp
                j = j + 1
            i = i + 1

        return results

    def top_words(text as str, limit as int) as List(WordFrequency)
        var all_freqs = analyze_text(text)
        var results as List(WordFrequency) = List()

        var i = 0
        while i < limit and i < all_freqs.count()
            results.add(all_freqs.at(i))
            i = i + 1

        return results
```
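A quick sanity check (hypothetical usage; ordering among equal counts depends on HashMap iteration order, since the bubble sort only swaps on strictly smaller counts):

```
class FrequencyDemo
    shared def main
        var analyzer = FrequencyAnalyzer()
        var top = analyzer.top_words("the cat and the dog and the bird", 3)
        for wf in top
            print wf.to_string()
        # Expected:
        #   the: 3
        #   and: 2
        #   plus one of the count-1 words (cat, dog, or bird)
```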


Step 2: N-gram Extraction

Extract contiguous sequences of N words:

```
// file: ngram_analysis.zbr
// teaches: n-gram extraction and pattern detection
// project: Project-3-Data-Analysis

class NGram
    var gram as str
    var count as int
    var positions as List(int)   # Track where it appears

    def init(gram as str)
        this.gram = gram
        count = 1
        positions = List()

class NGramAnalyzer
    shared def extract_ngrams(text as str, n as int) as HashMap(str, NGram)
        var words = text.lower().split(" ")
        var ngrams as HashMap(str, NGram) = HashMap()

        var i = 0
        while i < words.count() - (n - 1)
            var gram = ""
            var j = 0
            while j < n
                var word = words.at(i + j).trim()
                if j > 0
                    gram = gram.concat(" ")
                gram = gram.concat(word)
                j = j + 1

            if ngrams.contains(gram)
                var ng = ngrams.fetch(gram)
                ng.count = ng.count + 1
                ng.positions.add(i)
            else
                var ng = NGram(gram)
                ng.positions.add(i)
                ngrams.put(gram, ng)

            i = i + 1

        return ngrams

    def top_ngrams(text as str, n as int, limit as int) as List(NGram)
        var all_grams = extract_ngrams(text, n)
        var results as List(NGram) = List()

        # Simple sorting
        for gram, ng in all_grams
            results.add(ng)

        # Bubble sort by count
        var i = 0
        while i < results.count()
            var j = 0
            while j < results.count() - 1
                var current = results.at(j)
                var next = results.at(j + 1)
                if current.count < next.count
                    var temp = current
                    results.at(j) = next
                    results.at(j + 1) = temp
                j = j + 1
            i = i + 1

        # Return top N
        var top as List(NGram) = List()
        i = 0
        while i < limit and i < results.count()
            top.add(results.at(i))
            i = i + 1

        return top
```
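For example (hypothetical usage), a four-word phrase yields three overlapping bigrams, each seen once:

```
class NGramDemo
    shared def main
        var grams = NGramAnalyzer.extract_ngrams("the quick brown fox", 2)
        # grams contains "the quick", "quick brown", and "brown fox",
        # each with count 1 and a single starting position
        print "Extracted ${grams.count()} bigrams"   # 3
```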


Step 3: Similarity Analysis

Compare texts using Jaccard and other similarity metrics:
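Jaccard similarity treats each text as a set of words and divides the size of the intersection by the size of the union:

J(A, B) = |A ∩ B| / |A ∪ B|

The implementation below approximates this over word lists rather than true sets (as its comments note), so repeated words can skew the score.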

```
// file: similarity_analysis.zbr
// teaches: similarity metrics and comparison
// project: Project-3-Data-Analysis

class SimilarityMetrics
    shared def jaccard_similarity(text1 as str, text2 as str) as float
        var words1 = text1.lower().split(" ")
        var words2 = text2.lower().split(" ")

        # Find intersection
        var intersection = 0
        for word1 in words1
            for word2 in words2
                if word1 == word2
                    intersection = intersection + 1
                    break

        # Find union (crude approximation)
        var union = words1.count() + words2.count() - intersection

        if union == 0
            return 0.0

        # Multiply by 1.0 to force float division
        return (1.0 * intersection) / union

    def cosine_similarity(text1 as str, text2 as str) as float
        # Simplified cosine similarity (not true cosine, but similar)
        var words1 = text1.lower().split(" ")
        var words2 = text2.lower().split(" ")

        var common = 0
        for word1 in words1
            for word2 in words2
                if word1 == word2
                    common = common + 1

        var len1 = words1.count()
        var len2 = words2.count()

        if len1 == 0 or len2 == 0
            return 0.0

        var denominator = len1 + len2
        return (2.0 * common) / denominator

    def hamming_distance(text1 as str, text2 as str) as int
        var words1 = text1.lower().split(" ")
        var words2 = text2.lower().split(" ")

        var max_len = words1.count()
        if words2.count() > max_len
            max_len = words2.count()

        var distance = 0
        var i = 0
        while i < max_len
            var w1 = if i < words1.count() then words1.at(i) else ""
            var w2 = if i < words2.count() then words2.at(i) else ""

            if w1 != w2
                distance = distance + 1

            i = i + 1

        return distance
```
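A small worked example (hypothetical usage): for "the cat sat" and "the cat ran", two of the three words match, so the approximate union is 3 + 3 - 2 = 4 and the score is 2 / 4 = 0.5.

```
class SimilarityDemo
    shared def main
        var score = SimilarityMetrics.jaccard_similarity("the cat sat", "the cat ran")
        print "Jaccard similarity: ${score}"   # 0.5
```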


Step 4: Main Analysis Application

Tie together all analysis tools:

```
// file: analysis_main.zbr
// teaches: combining analysis modules
// project: Project-3-Data-Analysis

class TextAnalysisReport
    var source_file as str
    var word_count as int
    var unique_words as int
    var top_words as List(WordFrequency)
    var bigrams as List(NGram)
    var trigrams as List(NGram)

class AnalysisApplication
    shared def analyze_file(filename as str) as Result(TextAnalysisReport, str)
        var content_result = File.read(filename)

        if content_result.len == 0
            return Result.err("File not found or empty")

        var content = content_result
        var words = content.split(" ")
        var unique_words_set as HashMap(str, int) = HashMap()

        for word in words
            var cleaned = word.lower().trim()
            if cleaned.len > 0
                unique_words_set.put(cleaned, 1)

        var freq_analyzer = FrequencyAnalyzer()
        var top_words = freq_analyzer.top_words(content, 10)

        var bigram_analyzer = NGramAnalyzer()
        var bigrams = bigram_analyzer.top_ngrams(content, 2, 5)
        var trigrams = bigram_analyzer.top_ngrams(content, 3, 5)

        var report = TextAnalysisReport()
        report.source_file = filename
        report.word_count = words.count()
        report.unique_words = unique_words_set.count()
        report.top_words = top_words
        report.bigrams = bigrams
        report.trigrams = trigrams

        return Result.ok(report)

    shared def print_report(report as TextAnalysisReport)
        print "==== Text Analysis Report ===="
        print "File: ${report.source_file}"
        print "Total words: ${report.word_count}"
        print "Unique words: ${report.unique_words}"
        print ""

        print "Top 10 Words:"
        for wf in report.top_words
            print " ${wf.to_string()}"
        print ""

        print "Top 5 Bigrams:"
        for bigram in report.bigrams
            print " ${bigram.gram} (${bigram.count})"
        print ""

        print "Top 5 Trigrams:"
        for trigram in report.trigrams
            print " ${trigram.gram} (${trigram.count})"

class Main
    shared def main
        var result = AnalysisApplication.analyze_file("sample.txt")

        if result.isOk()
            var report = result.okValue()
            AnalysisApplication.print_report(report)
        else
            print "Error: ${result.errValue()}"
```

Note that print_report is marked shared so Main can call it directly on the class, matching the call to analyze_file.


Exercises

1. Find most common bigrams and trigrams: Already implemented in Step 2
2. Detect language patterns: Compare bigram distributions across texts
3. Compare two documents: Use similarity metrics from Step 3
4. Find suspicious passages: Identify sections with high similarity to other documents (plagiarism detection)
5. Build a frequency graph: Output word frequency distribution
6. Implement TF-IDF: Weight words by frequency and document uniqueness (see the formula after this list)
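For exercise 6, one standard TF-IDF formulation (others exist) weights a term t in document d as:

tf-idf(t, d) = tf(t, d) × log(N / df(t))

where tf(t, d) is how often t occurs in d, df(t) is the number of documents in the collection containing t, and N is the total number of documents. The FrequencyAnalyzer from Step 1 already provides tf; computing df requires counting occurrences across a set of files.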

Testing

```
# Analyze a file
zebra analysis_main.zbr sample.txt

# Compare two files
zebra compare_files.zbr file1.txt file2.txt
```


Project Comparison Summary

| Feature | Project 1 | Project 2 | Project 3 |
|---------|-----------|-----------|-----------|
| Focus | File I/O + CLI | Networking + Routing | Algorithms + Data Structures |
| Core Skill | Argument parsing, basic text processing | Network protocols, request handling | Complex algorithms, statistics |
| Complexity | Beginner-Intermediate | Intermediate | Intermediate-Advanced |
| Code Size | 300-400 lines | 400-600 lines | 350-500 lines |
| Key Learning | Modules, error handling, Results | Servers, routing, state management | Data structures, sorting, metrics |
| Time Estimate | 3-4 hours | 5-7 hours | 4-5 hours |
| Real-World Use | Log analysis, text processing | API servers, web services | Data science, plagiarism detection |


Capstone Challenge: Integrated System

After completing all three projects, combine them:

```
class IntegratedSystem
    shared def run_analysis_via_http(port as int, analysis_dir as str)
        # Start HTTP server (Project 2)
        # Serve text analysis results (Project 3)
        # Process files via CLI (Project 1)

        # GET /health   — Health check
        # POST /analyze — Upload and analyze file
        # GET /results  — Retrieve analysis results
        # GET /compare  — Compare two documents
```
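One way the POST /analyze endpoint might be wired up, reusing the RequestHandler interface from Project 2 and AnalysisApplication from Project 3 (a sketch; the handler name, the choice to treat the body as a filename, and the response shape are all assumptions):

```
class AnalyzeHandler implements RequestHandler
    def handle(request as HttpRequest) as HttpResponse
        # Simplification: treat the request body as the filename to analyze
        var result = AnalysisApplication.analyze_file(request.body)

        if result.isErr()
            return HttpResponse.bad_request(result.errValue())

        var report = result.okValue()
        return HttpResponse.ok("{\"file\": \"${report.source_file}\", \"words\": ${report.word_count}, \"unique\": ${report.unique_words}}")
```

Registering it is one line: server.register_handler("/analyze", AnalyzeHandler()).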

This demonstrates:

- ✅ Networking and servers
- ✅ Complex data processing
- ✅ CLI integration
- ✅ Professional system design


Each project is a portfolio piece. Together, they demonstrate mastery of Zebra and modern programming fundamentals.


Project Comparison

| Feature | CLI Tool | HTTP Server | Data Analysis |
|---------|----------|-------------|---------------|
| Lines of code | 200-300 | 300-500 | 250-400 |
| Main focus | File I/O | Networking | Algorithms |
| Difficulty | Beginner | Intermediate | Intermediate |
| Key learning | CLI, modules | Servers, protocols | Data structures |
| Time to complete | 3-4 hours | 5-7 hours | 4-5 hours |


Progressive Difficulty

1. CLI Tool: Learn file I/O and basic structure
2. HTTP Server: Add networking and concurrency concepts
3. Data Analysis: Deep dive into algorithms and data structures

Each project reuses concepts from previous ones while introducing new challenges.


Capstone Challenges

After completing all three:

1. Integrate the CLI tool with the HTTP server (serve file statistics)
2. Use data analysis in HTTP endpoints
3. Build a combined system processing files via HTTP


Expected Outcomes

- ✅ Real-world program architecture
- ✅ Network programming fundamentals
- ✅ Advanced data structure manipulation
- ✅ Practical error handling
- ✅ Performance considerations
- ✅ Testing strategies


Each project is a portfolio piece demonstrating Zebra mastery.