🎯 What You'll Learn

  • How to debug common Dione platform issues
  • Performance troubleshooting techniques
  • Emergency procedures for production issues
  • Monitoring and alerting best practices
  • Tribal knowledge and unwritten rules

Common Issues and Solutions

1. Application Won't Start

🚫 Symptom

JBoss fails to deploy applications

Common Causes:

  • Database connection failures
  • Missing JNDI resources
  • Spring context configuration errors
  • Cache connection issues

Debugging Steps

Log Analysis Commands
# 1. Check JBoss logs
tail -f /opt/jboss-eap-7.0/standalone/log/server.log

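# 2. Look for these error patterns (both filters are case-insensitive, so "JNDI" and "Spring" also match):
grep -i "ERROR" /opt/jboss-eap-7.0/standalone/log/server.log | grep -iE "(datasource|jndi|spring|cache)"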

# 3. Check database connectivity
sqlcmd -S localhost -U ngop_user -P your_password -Q "SELECT COUNT(*) FROM Customer"
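
When the same triage has to be repeated across several log files, the grep patterns above can be folded into a small helper. The following is only a sketch and is not part of Dione: the class name `StartupErrorClassifier` is hypothetical, and the keyword list simply mirrors the four common causes listed earlier.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper (not part of Dione): maps a server.log line to one of
// the four startup failure causes listed in this section.
final class StartupErrorClassifier {

    // Keyword -> cause. Insertion order decides which keyword wins when several match.
    private static final Map<String, String> CAUSES = new LinkedHashMap<>();
    static {
        CAUSES.put("datasource", "Database connection failure");
        CAUSES.put("jndi", "Missing JNDI resource");
        CAUSES.put("spring", "Spring context configuration error");
        CAUSES.put("cache", "Cache connection issue");
    }

    // Returns the likely cause for a line containing "error", or empty for anything else.
    static Optional<String> classify(String logLine) {
        String lower = logLine.toLowerCase(Locale.ROOT);
        if (!lower.contains("error")) {
            return Optional.empty();
        }
        for (Map.Entry<String, String> cause : CAUSES.entrySet()) {
            if (lower.contains(cause.getKey())) {
                return Optional.of(cause.getValue());
            }
        }
        return Optional.empty();
    }
}
```

Because "datasource" is checked first, a line that mentions both a datasource and its JNDI name is reported as a database problem, which is usually the more useful starting point.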

Common Fixes

Database Connection Fix
<datasource jndi-name="java:jboss/datasources/NGOPMSSQLDS" pool-name="NGOPMSSQLDS">
    <connection-url>jdbc:sqlserver://localhost:1433;databaseName=NGOP</connection-url>
    <driver>sqlserver</driver>
    <security>
        <user-name>ngop_user</user-name>
        <password>your_password</password>
    </security>
</datasource>

2. Session Issues

โš ๏ธ Symptom

Users getting "Session expired" errors immediately after login

Likely root cause: the cache server is unreachable, or the session configuration is wrong

Session Debugging Code
// 1. Check cache configuration
LOG.info("Cache enabled: " + PropertyHandler.getProperty("infinispan.client.hotrod.enable"));
LOG.info("Cache server: " + PropertyHandler.getProperty("infinispan.client.hotrod.server_list"));

// 2. Test cache connectivity
try {
    cacheDataConsumer.put("test_key", "test_value", 60);
    String value = (String) cacheDataConsumer.get("test_key");
    LOG.info("Cache test successful: " + value);
} catch (Exception e) {
    LOG.error("Cache test failed", e);
}

// 3. Session debugging
public PojoMap verifyNGOPSession(RequestSessionHeader requestSessionHeader, 
                                Holder<ResponseStatusHeader> responseStatusHeader,
                                String orgCode, String custID) {
    
    String sessionId = requestSessionHeader.getSessionID();
    LOG.info("Verifying session: " + sessionId + " for org: " + orgCode);
    
    PojoMap sessionData = getSessionFromCache(sessionId, orgCode);
    if (sessionData == null) {
        LOG.warn("Session not found in cache: " + sessionId);
        throw new NGOPBaseException("Session expired");
    }
    
    LOG.info("Session found, customer: " + sessionData.getString("customerID"));
    return sessionData;
}
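
The cache connectivity test in step 2 can also be packaged as a reusable probe. The sketch below makes one loud assumption: the `SimpleCache` interface is hypothetical and only stands in for whatever `cacheDataConsumer` really exposes. Its `put(key, value, ttl)` and `get(key)` shapes are copied from the snippet above, nothing more.

```java
import java.util.UUID;

// Hypothetical stand-in for the real cache client; only the put/get shapes
// are taken from the debugging snippet in this section.
interface SimpleCache {
    void put(String key, Object value, int ttlSeconds);
    Object get(String key);
}

final class CacheProbe {

    // Writes a unique key, reads it back, and reports whether the value survived.
    static String roundTrip(SimpleCache cache) {
        String key = "probe_" + UUID.randomUUID();
        String expected = "probe_value";
        try {
            cache.put(key, expected, 60);
            Object actual = cache.get(key);
            if (expected.equals(actual)) {
                return "OK";
            }
            return "MISMATCH: wrote '" + expected + "', read '" + actual + "'";
        } catch (RuntimeException e) {
            return "FAILED: " + e.getMessage();
        }
    }
}
```

A `MISMATCH` with a `null` read-back means the write was accepted but the entry was gone on the next read. If sessions behave the same way, that would be consistent with the "Session expired" symptom, though it does not prove it.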

Performance Troubleshooting

Memory Issues

JVM Memory Analysis
# JBoss heap analysis
jstat -gc $JBOSS_PID 5s

# Look for these patterns:
# - Frequent full GC
# - Old generation constantly growing
# - Eden space filling too quickly

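# Common fixes (append to JAVA_OPTS in /opt/jboss-eap-7.0/bin/standalone.conf, then restart):
JAVA_OPTS="$JAVA_OPTS -Xms2g -Xmx4g -XX:MetaspaceSize=256m -XX:MaxMetaspaceSize=512m"
JAVA_OPTS="$JAVA_OPTS -XX:+UseG1GC -XX:MaxGCPauseMillis=200"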
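
The same heap and GC figures can be read from inside the JVM through the standard `java.lang.management` API, which is handy when shell access to the server is not available. This is a minimal sketch; the class name `HeapSnapshot` is hypothetical, and where to call it from (a debug endpoint, a scheduled log line) is left open.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper: reports heap usage and GC activity for the current JVM.
final class HeapSnapshot {

    // Percentage of the maximum heap currently in use, or -1 if the JVM reports no maximum.
    static double heapUsedPercent() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long max = heap.getMax();
        return max <= 0 ? -1.0 : 100.0 * heap.getUsed() / max;
    }

    // One line per collector: name, number of collections, accumulated collection time.
    static String gcSummary() {
        StringBuilder sb = new StringBuilder();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append(gc.getName())
              .append(": count=").append(gc.getCollectionCount())
              .append(", timeMs=").append(gc.getCollectionTime())
              .append(System.lineSeparator());
        }
        return sb.toString();
    }
}
```

Logging `HeapSnapshot.gcSummary()` at intervals and comparing the counts gives a rough view of how often each collector runs, which complements the `jstat` output rather than replacing it.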

Database Performance

SQL Server Performance Queries
-- Check for blocked requests (long blocking chains often accompany deadlocks)
SELECT 
    session_id,
    blocking_session_id,
    wait_type,
    wait_time,
    wait_resource
FROM sys.dm_exec_requests 
WHERE blocking_session_id <> 0;

-- Common deadlock patterns:
-- 1. Customer update + Transaction insert
-- 2. Organization lookup + Customer creation
-- 3. Cache operations + Database updates

-- Missing organization filter (performance killer)
SELECT * FROM Customer WHERE Email = 'user@example.com';
-- Should be:
SELECT * FROM Customer WHERE Email = 'user@example.com' AND OrganizationID = 2;

-- Missing indexes
CREATE INDEX IX_Customer_Email_OrgID ON Customer(Email, OrganizationID);
CREATE INDEX IX_TransactionHistory_Customer ON TransactionHistory(CustomerID, OrganizationID);

Environment-Specific Issues

Development Environment

Development Configuration
# ngop.properties debugging values
log.level=DEBUG
hibernate.show_sql=true
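# Use local cache. Keep the comment on its own line: java.util.Properties treats a trailing "# ..." as part of the value.
infinispan.client.hotrod.enable=false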
ngop.serivce.3rdsecure.returnurl=http://localhost:8080/CustomerPortal
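
One pitfall with property files is worth a demonstration: `java.util.Properties` only recognises `#` as a comment marker at the start of a line, so text after a value is read as part of that value. The sketch below assumes `ngop.properties` is loaded through `java.util.Properties`, which this guide does not confirm; the class name `InlineCommentDemo` is hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical demo: shows how java.util.Properties parses a single line.
final class InlineCommentDemo {

    // Loads one "key=value" line exactly as Properties.load would read it from a file.
    static String load(String line, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(line));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }
}
```

Loading the line `infinispan.client.hotrod.enable=false  # Use local cache` this way returns the value `false  # Use local cache`, not `false`, so an exact string comparison against `"false"` would fail.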

Production Issues

Production Troubleshooting
# Common production problems:

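# 1. SSL certificate expiry (read the "Valid from ... until" line; changeit is the default cacerts password)
keytool -list -v -keystore $JAVA_HOME/jre/lib/security/cacerts -storepass changeit | grep -i -A 8 "salesforce"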

# 2. External service timeouts
grep -i "timeout" /opt/jboss-eap-7.0/standalone/log/server.log

# 3. Cache cluster split-brain
grep -i "cluster" /var/log/infinispan/infinispan.log

# 4. Database connection exhaustion (count established connections to SQL Server)
netstat -an | grep ':1433' | grep -c ESTABLISHED
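
Certificate expiry can also be checked from Java so that it is caught before it becomes an outage. The sketch below scans the trust store the JVM uses by default, which is `cacerts` unless `javax.net.ssl.trustStore` points elsewhere. The class name `CertExpiryCheck` and the idea of a fixed day threshold are assumptions for illustration, not something Dione ships.

```java
import java.security.GeneralSecurityException;
import java.security.KeyStore;
import java.security.cert.X509Certificate;
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import javax.net.ssl.TrustManager;
import javax.net.ssl.TrustManagerFactory;
import javax.net.ssl.X509TrustManager;

// Hypothetical helper: lists trusted certificates that are close to expiry.
final class CertExpiryCheck {

    // Whole days from 'now' until 'notAfter', truncated toward zero; negative after expiry.
    static long daysUntil(Instant notAfter, Instant now) {
        return Duration.between(now, notAfter).toDays();
    }

    // Subjects of certificates in the JVM default trust store that expire within 'days'.
    static List<String> expiringWithin(long days) throws GeneralSecurityException {
        TrustManagerFactory tmf =
                TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
        tmf.init((KeyStore) null); // null selects the JVM default trust store
        List<String> expiring = new ArrayList<>();
        Instant now = Instant.now();
        for (TrustManager tm : tmf.getTrustManagers()) {
            if (!(tm instanceof X509TrustManager)) {
                continue;
            }
            for (X509Certificate cert : ((X509TrustManager) tm).getAcceptedIssuers()) {
                if (daysUntil(cert.getNotAfter().toInstant(), now) <= days) {
                    expiring.add(cert.getSubjectX500Principal().getName());
                }
            }
        }
        return expiring;
    }
}
```

Calling `CertExpiryCheck.expiringWithin(30)` from a scheduled job and alerting on a non-empty result would be one way to use it. Note that this checks trusted certificates in the local store only, not the certificate a remote service presents.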

Monitoring and Alerting

Health Check Implementation

Custom Health Check Service
// Custom health check implementation
@WebService
public class HealthCheckService {
    
    @WebMethod
    public HealthStatus checkHealth() {
        HealthStatus status = new HealthStatus();
        
        // Database connectivity
        try {
            // Portable JPQL: COUNT needs an entity alias (COUNT(*) is a Hibernate-only extension)
            entityManager.createQuery("SELECT COUNT(c) FROM Customer c").getSingleResult();
            status.setDatabaseStatus("OK");
        } catch (Exception e) {
            status.setDatabaseStatus("FAILED: " + e.getMessage());
        }
        
        // Cache connectivity
        try {
            cacheService.put("health_check", new Date());
            status.setCacheStatus("OK");
        } catch (Exception e) {
            status.setCacheStatus("FAILED: " + e.getMessage());
        }
        
        // External services
        status.setPricingEngineStatus(checkPricingEngine());
        status.setSalesforceStatus(checkSalesforce());
        
        return status;
    }
}
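
The `checkPricingEngine()` and `checkSalesforce()` helpers are not shown in this guide. One plausible shape for them is an HTTP probe with short timeouts, so that a hung dependency cannot stall the health check itself. The sketch below is an assumption, not the actual implementation: the class name `HttpProbe`, the use of a HEAD request, and the rule that any status below 500 counts as reachable are all illustrative choices. It expects an `http` or `https` URL.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper: probes an HTTP(S) dependency with bounded wait times.
final class HttpProbe {

    // Returns "OK" if the service answers with a status below 500, otherwise "FAILED: ...".
    static String check(String url, int timeoutMillis) {
        HttpURLConnection conn = null;
        try {
            conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setRequestMethod("HEAD");
            conn.setConnectTimeout(timeoutMillis); // limit time spent connecting
            conn.setReadTimeout(timeoutMillis);    // limit time spent waiting for a response
            int code = conn.getResponseCode();
            // getResponseCode() returns -1 when the reply is not valid HTTP.
            return (code > 0 && code < 500) ? "OK" : "FAILED: HTTP " + code;
        } catch (IOException e) {
            return "FAILED: " + e.getMessage();
        } finally {
            if (conn != null) {
                conn.disconnect();
            }
        }
    }
}
```

A helper such as `checkPricingEngine()` could then return `HttpProbe.check(pricingEngineUrl, 3000)`, where `pricingEngineUrl` is whatever address the environment configures.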

Key Metrics to Monitor

📊

Application Metrics

  • Response time 95th percentile < 2000ms
  • Error rate < 1%
  • Active sessions < 1000
  • Database connections < 80% of pool
💻

System Metrics

  • CPU usage < 80%
  • Memory usage < 85%
  • Disk usage < 90%
  • Network latency < 100ms
💰

Business Metrics

  • Login success rate > 98%
  • Payment success rate > 99%
  • External service availability > 99.5%
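
The application thresholds listed above translate directly into an alert check. The sketch below is illustrative only: the class name `ApplicationMetricCheck` is hypothetical, and how the four input values are collected (JMX, access logs, a monitoring agent) is outside its scope.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: compares sampled values with the application metric limits in this guide.
final class ApplicationMetricCheck {

    // Returns one message per breached limit; an empty list means all four metrics are in range.
    static List<String> breaches(double p95ResponseMs, double errorRatePercent,
                                 int activeSessions, double poolUsagePercent) {
        List<String> alerts = new ArrayList<>();
        if (p95ResponseMs >= 2000) {
            alerts.add("p95 response time is " + p95ResponseMs + " ms (limit 2000 ms)");
        }
        if (errorRatePercent >= 1.0) {
            alerts.add("error rate is " + errorRatePercent + "% (limit 1%)");
        }
        if (activeSessions >= 1000) {
            alerts.add("active sessions at " + activeSessions + " (limit 1000)");
        }
        if (poolUsagePercent >= 80.0) {
            alerts.add("database pool usage is " + poolUsagePercent + "% (limit 80%)");
        }
        return alerts;
    }
}
```

For example, `ApplicationMetricCheck.breaches(2500, 0.2, 1200, 55)` reports two breaches (response time and active sessions), while `breaches(850, 0.2, 400, 55)` returns an empty list.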

Emergency Procedures

Service Degradation Response

Emergency Response Steps
# 1. Identify the issue
grep -i "error\|exception\|failed" /opt/jboss-eap-7.0/standalone/log/server.log | tail -100

# 2. Check external dependencies
curl -I https://test.salesforce.com/services/Soap/c/38.0
curl -I http://172.31.4.4:8080/PricingEngine/

# 3. Cache reset (if session issues)
curl -X POST http://localhost:8080/DioneBusinessServices/service/cacheServices/clearCache

# 4. Database connection reset
/opt/jboss-eap-7.0/bin/jboss-cli.sh --connect --command="/subsystem=datasources/data-source=NGOPMSSQLDS:flush-all-connection-in-pool"

# 5. Application restart (last resort)
sudo systemctl restart jboss-eap

Rollback Procedures

Emergency Rollback
# 1. Application rollback
cd /opt/jboss-eap-7.0/standalone/deployments
cp backup/CustomerPortal.war.backup CustomerPortal.war

# 2. Database rollback (if schema changes)
sqlcmd -S localhost -d NGOP -i rollback_scripts/rollback_v2.0.7.sql

# 3. Configuration rollback
cp backup/ngop.properties.backup ngop.properties
sudo systemctl restart jboss-eap

Developer Productivity Tips

Fast Development Cycle

Development Shortcuts
# 1. Skip tests during development (maven.test.skip also skips compiling the tests)
mvn clean package -Dmaven.test.skip=true

# 2. Hot deployment (if supported)
mvn compile war:exploded

# 3. Database reset script
sqlcmd -S localhost -d NGOP_DEV -Q "EXEC sp_reset_dev_data"

# 4. Cache clear for configuration changes
curl -X POST http://localhost:8080/DioneBusinessServices/service/cacheServices/clearCache

Useful Debugging Tools

🧼

SOAP UI

For testing web services - Import WSDL: http://localhost:8080/DioneBusinessServices/service/authServices?wsdl

🗄️

SQL Server Management Studio

Connect to: localhost,1433 for database debugging

📊

JConsole

For JVM monitoring: jconsole localhost:9999

Log Analysis

tail -f server.log | grep -E "(ERROR|WARN|session|payment)"

🎯 Embracing the Reality

Dione is a functional, profitable, enterprise financial platform that has been serving customers for over a decade. It's not perfect, but it works. Your job is to:

  • Understand the business context - Financial services have unique requirements
  • Work with the existing architecture - Don't fight the SOAP services
  • Respect the legacy decisions - They solved real problems at the time
  • Improve incrementally - Small, safe changes over time
  • Document your workarounds - Help the next developer

Remember: The best code is code that works in production and makes money for the business. Dione does both.