Debugging & Survival Guide
Practical Troubleshooting and Tribal Knowledge
🎯 What You'll Learn
- How to debug common Dione platform issues
- Performance troubleshooting techniques
- Emergency procedures for production issues
- Monitoring and alerting best practices
- Tribal knowledge and unwritten rules
Common Issues and Solutions
1. Application Won't Start
🚫 Symptom
JBoss fails to deploy applications
Common Causes:
- Database connection failures
- Missing JNDI resources
- Spring context configuration errors
- Cache connection issues
Debugging Steps
Log Analysis Commands
# 1. Check JBoss logs
tail -f /opt/jboss-eap-7.0/standalone/log/server.log
# 2. Look for these error patterns:
grep -i "ERROR" server.log | grep -iE "(datasource|jndi|spring|cache)"
# 3. Check database connectivity
sqlcmd -S localhost -U ngop_user -P your_password -Q "SELECT COUNT(*) FROM Customer"
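The same filtering can be done from Java when shell access to the server is not available. The following is a minimal sketch, not part of the Dione codebase: the class name StartupLogTriage is hypothetical, and the keywords are simply the ones from the grep pattern above. Matching ignores case so that "JNDI" and "Spring" are caught as well.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper that mirrors the grep pipeline: ERROR lines that mention a known startup dependency. */
final class StartupLogTriage {
    // Keywords copied from the grep pattern in this guide.
    private static final String[] KEYWORDS = {"datasource", "jndi", "spring", "cache"};

    /** Returns the lines that contain "error" and at least one keyword, ignoring case. */
    static List<String> suspiciousLines(List<String> logLines) {
        List<String> hits = new ArrayList<>();
        for (String line : logLines) {
            String lower = line.toLowerCase(Locale.ROOT);
            if (!lower.contains("error")) {
                continue; // Not an error line, skip it.
            }
            for (String keyword : KEYWORDS) {
                if (lower.contains(keyword)) {
                    hits.add(line);
                    break; // One keyword is enough; avoid adding the line twice.
                }
            }
        }
        return hits;
    }
}
```

Feed it the lines of server.log (for example via Files.readAllLines) and start with the earliest hit, since later errors during startup are often consequences of the first one.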
Common Fixes
Database Connection Fix
<datasource jndi-name="java:jboss/datasources/NGOPMSSQLDS" pool-name="NGOPMSSQLDS">
    <connection-url>jdbc:sqlserver://localhost:1433;databaseName=NGOP</connection-url>
    <driver>sqlserver</driver>
    <security>
        <user-name>ngop_user</user-name>
        <password>your_password</password>
    </security>
</datasource>
2. Session Issues
⚠️ Symptom
Users getting "Session expired" errors immediately after login
Root Cause: Cache connectivity or session configuration
Session Debugging Code
// 1. Check cache configuration
LOG.info("Cache enabled: " + PropertyHandler.getProperty("infinispan.client.hotrod.enable"));
LOG.info("Cache server: " + PropertyHandler.getProperty("infinispan.client.hotrod.server_list"));
// 2. Test cache connectivity
try {
    cacheDataConsumer.put("test_key", "test_value", 60);
    String value = (String) cacheDataConsumer.get("test_key");
    LOG.info("Cache test successful: " + value);
} catch (Exception e) {
    LOG.error("Cache test failed", e);
}
// 3. Session debugging
public PojoMap verifyNGOPSession(RequestSessionHeader requestSessionHeader,
        Holder<ResponseStatusHeader> responseStatusHeader,
        String orgCode, String custID) {
    String sessionId = requestSessionHeader.getSessionID();
    LOG.info("Verifying session: " + sessionId + " for org: " + orgCode);
    PojoMap sessionData = getSessionFromCache(sessionId, orgCode);
    if (sessionData == null) {
        LOG.warn("Session not found in cache: " + sessionId);
        throw new NGOPBaseException("Session expired");
    }
    LOG.info("Session found, customer: " + sessionData.getString("customerID"));
    return sessionData;
}
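The cache connectivity test in step 2 can be wrapped into a reusable probe. This is a sketch under stated assumptions: SimpleCache is a hypothetical interface modelled on the put(key, value, ttl) and get(key) calls shown above, and the real cacheDataConsumer API may differ. The in-memory implementation exists only to exercise the probe and ignores the TTL.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical minimal cache contract; the real cacheDataConsumer API may differ. */
interface SimpleCache {
    void put(String key, Object value, int ttlSeconds);
    Object get(String key);
}

final class CacheProbe {
    /** Writes a probe value and reads it back; true only if the round trip returns the same value. */
    static boolean roundTripOk(SimpleCache cache) {
        String key = "probe_" + System.nanoTime();
        String expected = "probe_value";
        try {
            cache.put(key, expected, 60);
            return expected.equals(cache.get(key));
        } catch (RuntimeException e) {
            return false; // Any cache failure counts as "not healthy".
        }
    }

    /** In-memory stand-in used only to exercise the probe; it ignores the TTL. */
    static SimpleCache inMemory() {
        Map<String, Object> store = new ConcurrentHashMap<>();
        return new SimpleCache() {
            @Override public void put(String key, Object value, int ttlSeconds) { store.put(key, value); }
            @Override public Object get(String key) { return store.get(key); }
        };
    }
}
```

If the probe fails while logins succeed, sessions are probably being written to a cache that cannot be read back, which matches the "expired immediately after login" symptom.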
Performance Troubleshooting
Memory Issues
JVM Memory Analysis
# JBoss heap analysis
jstat -gc $JBOSS_PID 5s
# Look for these patterns:
# - Frequent full GC
# - Old generation constantly growing
# - Eden space filling too quickly
# Common fixes (JVM options, typically set through JAVA_OPTS in bin/standalone.conf):
-Xms2g -Xmx4g -XX:MetaspaceSize=256m -XX:MaxMetaspaceSize=512m
-XX:+UseG1GC -XX:MaxGCPauseMillis=200
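When jstat is not available on the host, the same numbers can be read from inside the JVM through the standard java.lang.management API. The sketch below is illustrative only; the class name HeapSnapshot is hypothetical, and where you log or expose the values is up to you.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

final class HeapSnapshot {
    /** Heap currently in use, in megabytes. */
    static long usedHeapMb() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getUsed() / (1024 * 1024);
    }

    /** Total number of collections across all collectors since JVM start. */
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 when the collector does not report a count
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("Used heap (MB): " + usedHeapMb());
        System.out.println("GC count: " + totalGcCount());
    }
}
```

Sampling these values periodically and comparing them over time shows the same patterns listed above: a heap that keeps growing after collections, or a GC count that climbs quickly.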
Database Performance
SQL Server Performance Queries
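-- Check for blocked sessions (this shows live blocking; actual deadlocks are resolved by SQL Server and recorded in the system_health Extended Events session)
SELECT
    session_id,
    blocking_session_id,
    wait_type,
    wait_time,
    wait_resource
FROM sys.dm_exec_requests
WHERE blocking_session_id <> 0;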
-- Common deadlock patterns:
-- 1. Customer update + Transaction insert
-- 2. Organization lookup + Customer creation
-- 3. Cache operations + Database updates
-- Missing organization filter (performance killer)
SELECT * FROM Customer WHERE Email = 'user@example.com';
-- Should be:
SELECT * FROM Customer WHERE Email = 'user@example.com' AND OrganizationID = 2;
-- Missing indexes
CREATE INDEX IX_Customer_Email_OrgID ON Customer(Email, OrganizationID);
CREATE INDEX IX_TransactionHistory_Customer ON TransactionHistory(CustomerID, OrganizationID);
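The missing-organization-filter mistake is easy to catch mechanically. The following is a rough sketch of a guard that could run in a unit test or code review script. It is a plain text heuristic, not a SQL parser, and the class name OrgFilterCheck is hypothetical. The table and column names are the ones used in the queries above.

```java
import java.util.Locale;
import java.util.regex.Pattern;

final class OrgFilterCheck {
    private static final Pattern CUSTOMER_TABLE = Pattern.compile("\\bfrom\\s+customer\\b");
    private static final Pattern ORG_PREDICATE = Pattern.compile("\\borganizationid\\s*=");

    /** True when the query does not read Customer, or reads it with an OrganizationID predicate. */
    static boolean isOrgScoped(String sql) {
        String normalized = sql.toLowerCase(Locale.ROOT);
        if (!CUSTOMER_TABLE.matcher(normalized).find()) {
            return true; // Query does not touch Customer, nothing to check.
        }
        return ORG_PREDICATE.matcher(normalized).find();
    }
}
```

Because it only inspects text, it will miss joins, aliases, and IN clauses; treat a failure as a prompt to look at the query, not as proof of a bug.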
Environment-Specific Issues
Development Environment
Development Configuration
# ngop.properties debugging values
log.level=DEBUG
hibernate.show_sql=true
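# Use the local cache. Keep comments on their own line: in a Java .properties file, text after the value is read as part of the value.
infinispan.client.hotrod.enable=false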
ngop.serivce.3rdsecure.returnurl=http://localhost:8080/CustomerPortal
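A note on comments in .properties files: java.util.Properties only treats "#" as a comment marker at the start of a line. The sketch below demonstrates this, assuming ngop.properties is loaded through java.util.Properties (the guide does not say how PropertyHandler reads it). The class name InlineCommentDemo is hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class InlineCommentDemo {
    /** Parses the given properties text and returns the value stored under the key. */
    static String valueOf(String propertiesText, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        String key = "infinispan.client.hotrod.enable";
        // Inline comment: the "# ..." text ends up inside the value.
        System.out.println("[" + valueOf(key + "=false # Use local cache", key) + "]");
        // Comment on its own line: the value is just "false".
        System.out.println("[" + valueOf("# Use local cache\n" + key + "=false", key) + "]");
    }
}
```

Whether the polluted value causes a visible failure depends on how the application parses it, so it is safer to keep comments on separate lines.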
Production Issues
Production Troubleshooting
# Common production problems:
# 1. SSL certificate expiry
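# (changeit is the default cacerts password; -A 8 gives enough context to include the "Valid from ... until" line)
keytool -list -v -keystore "$JAVA_HOME/jre/lib/security/cacerts" -storepass changeit | grep -i -A 8 "salesforce"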
# 2. External service timeouts
grep -i "timeout" /opt/jboss-eap-7.0/standalone/log/server.log
# 3. Cache cluster split-brain
grep -i "cluster" /var/log/infinispan/infinispan.log
# 4. Database connection exhaustion (count established connections to SQL Server)
netstat -an | grep ':1433' | grep -c ESTABLISHED
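Certificate expiry (problem 1 above) is better caught before it happens. Here is a small sketch of the date arithmetic, assuming you already have the certificate's expiry date, for example from X509Certificate.getNotAfter(). The class name CertExpiry and the warning threshold are hypothetical.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Date;

final class CertExpiry {
    /** Whole days between now and the certificate's notAfter date (truncated toward zero). */
    static long daysUntilExpiry(Date notAfter, Instant now) {
        return Duration.between(now, notAfter.toInstant()).toDays();
    }

    /** True when the certificate expires in fewer than warnDays days, or has already expired. */
    static boolean needsRenewal(Date notAfter, Instant now, long warnDays) {
        return daysUntilExpiry(notAfter, now) < warnDays;
    }
}
```

A scheduled job that calls needsRenewal with a threshold of a few weeks leaves time to renew before production calls start failing.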
Monitoring and Alerting
Health Check Implementation
Custom Health Check Service
// Custom health check implementation.
// Assumes entityManager and cacheService are injected fields, and that
// checkPricingEngine() and checkSalesforce() are defined elsewhere in the class.
@WebService
public class HealthCheckService {
    @WebMethod
    public HealthStatus checkHealth() {
        HealthStatus status = new HealthStatus();
        // Database connectivity (portable JPQL: COUNT needs an entity alias)
        try {
            entityManager.createQuery("SELECT COUNT(c) FROM Customer c").getSingleResult();
            status.setDatabaseStatus("OK");
        } catch (Exception e) {
            status.setDatabaseStatus("FAILED: " + e.getMessage());
        }
        // Cache connectivity
        try {
            cacheService.put("health_check", new Date());
            status.setCacheStatus("OK");
        } catch (Exception e) {
            status.setCacheStatus("FAILED: " + e.getMessage());
        }
        // External services
        status.setPricingEngineStatus(checkPricingEngine());
        status.setSalesforceStatus(checkSalesforce());
        return status;
    }
}
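The HealthStatus class itself is not shown in this guide. The following is a purely hypothetical sketch of one possible shape, meant to show how per-component results can be reduced to a single healthy or unhealthy answer for a load balancer or alert. The class name HealthStatusSketch and its methods are assumptions, not the real Dione class.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical shape for a health status holder; the real HealthStatus class may differ. */
final class HealthStatusSketch {
    private final Map<String, String> components = new LinkedHashMap<>();

    /** Records the status string for one component, for example "OK" or "FAILED: timeout". */
    void set(String component, String status) {
        components.put(component, status);
    }

    /** Healthy only when at least one component is recorded and all report exactly "OK". */
    boolean isHealthy() {
        return !components.isEmpty() && failing().isEmpty();
    }

    /** Names of components whose status is not "OK", in the order they were recorded. */
    List<String> failing() {
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, String> entry : components.entrySet()) {
            if (!"OK".equals(entry.getValue())) {
                names.add(entry.getKey());
            }
        }
        return names;
    }
}
```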
Key Metrics to Monitor
Application Metrics
- Response time 95th percentile < 2000ms
- Error rate < 1%
- Active sessions < 1000
- Database connections < 80% of pool
System Metrics
- CPU usage < 80%
- Memory usage < 85%
- Disk usage < 90%
- Network latency < 100ms
Business Metrics
- Login success rate > 98%
- Payment success rate > 99%
- External service availability > 99.5%
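Two of the application thresholds above can be checked with a few lines of code. This sketch uses the nearest-rank method for the 95th percentile, which is one common definition among several; your monitoring tool may compute it differently. The class name MetricThresholds is hypothetical, and the limits are the ones listed above.

```java
import java.util.Arrays;

final class MetricThresholds {
    /** Nearest-rank 95th percentile of the samples; requires at least one sample. */
    static long p95(long[] responseTimesMs) {
        if (responseTimesMs.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        long[] sorted = responseTimesMs.clone();
        Arrays.sort(sorted);
        int rank = (95 * sorted.length + 99) / 100; // ceil(0.95 * n) using integer arithmetic, 1-based
        return sorted[rank - 1];
    }

    /** Response time target from this guide: 95th percentile below 2000 ms. */
    static boolean responseTimeOk(long[] responseTimesMs) {
        return p95(responseTimesMs) < 2000;
    }

    /** Error rate target from this guide: below 1% of requests. */
    static boolean errorRateOk(long errors, long total) {
        return total > 0 && (double) errors / total < 0.01;
    }
}
```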
Emergency Procedures
Service Degradation Response
Emergency Response Steps
# 1. Identify the issue
grep -i "error\|exception\|failed" /opt/jboss-eap-7.0/standalone/log/server.log | tail -100
# 2. Check external dependencies
curl -I https://test.salesforce.com/services/Soap/c/38.0
curl -I http://172.31.4.4:8080/PricingEngine/
# 3. Cache reset (if session issues)
curl -X POST http://localhost:8080/DioneBusinessServices/service/cacheServices/clearCache
# 4. Database connection reset
/opt/jboss-eap-7.0/bin/jboss-cli.sh --connect --command="/subsystem=datasources/data-source=NGOPMSSQLDS:flush-all-connection-in-pool"
# 5. Application restart (last resort)
sudo systemctl restart jboss-eap
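The dependency check in step 2 can also be scripted as a plain TCP probe with an explicit timeout, so a hung dependency does not stall the triage. This is a minimal sketch: the class name DependencyProbe is hypothetical, and a successful connect only shows that the port accepts connections, not that the service behind it is healthy.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

final class DependencyProbe {
    /** True when a TCP connection to host:port can be opened within timeoutMs milliseconds. */
    static boolean isReachable(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            return false; // Refused, unreachable, DNS failure, or timed out.
        }
    }
}
```

For example, isReachable("172.31.4.4", 8080, 3000) checks the pricing engine address used in the curl command above.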
Rollback Procedures
Emergency Rollback
# 1. Application rollback
cd /opt/jboss-eap-7.0/standalone/deployments
cp backup/CustomerPortal.war.backup CustomerPortal.war
# 2. Database rollback (if schema changes)
sqlcmd -S localhost -d NGOP -i rollback_scripts/rollback_v2.0.7.sql
# 3. Configuration rollback
cp backup/ngop.properties.backup ngop.properties
sudo systemctl restart jboss-eap
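The cp commands above overwrite the current file, which makes the rollback itself hard to undo. One way to avoid that, sketched below under the assumption that keeping an extra copy next to the target is acceptable in your deployment directory, is to save the current file before restoring the backup. The class name SafeRollback and the ".pre-rollback" suffix are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class SafeRollback {
    /**
     * Saves the current target as "<name>.pre-rollback" (if it exists), then copies the backup over it.
     * Returns the path of the safety copy.
     */
    static Path restore(Path backup, Path target) {
        try {
            Path safetyCopy = target.resolveSibling(target.getFileName() + ".pre-rollback");
            if (Files.exists(target)) {
                Files.copy(target, safetyCopy, StandardCopyOption.REPLACE_EXISTING);
            }
            Files.copy(backup, target, StandardCopyOption.REPLACE_EXISTING);
            return safetyCopy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```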
Developer Productivity Tips
Fast Development Cycle
Development Shortcuts
# 1. Skip tests during development (the compile phase never runs tests, so the flag only matters from package onward)
mvn clean package -Dmaven.test.skip=true
# 2. Hot deployment (if supported)
mvn compile war:exploded
# 3. Database reset script
sqlcmd -S localhost -d NGOP_DEV -Q "EXEC sp_reset_dev_data"
# 4. Cache clear for configuration changes
curl -X POST http://localhost:8080/DioneBusinessServices/service/cacheServices/clearCache
Useful Debugging Tools
SoapUI
For testing web services - Import WSDL: http://localhost:8080/DioneBusinessServices/service/authServices?wsdl
SQL Server Management Studio
Connect to: localhost,1433 for database debugging
JConsole
For JVM monitoring: jconsole localhost:9999
Log Analysis
tail -f server.log | grep -E "(ERROR|WARN|session|payment)"
🎯 Embracing the Reality
Dione is a functional, profitable, enterprise financial platform that has been serving customers for over a decade. It's not perfect, but it works. Your job is to:
- Understand the business context: financial services have unique requirements.
- Work with the existing architecture: don't fight the SOAP services.
- Respect the legacy decisions: they solved real problems at the time.
- Improve incrementally: small, safe changes over time.
- Document your workarounds: help the next developer.
Remember: The best code is code that works in production and makes money for the business. Dione does both.